Monday, September 12, 2016

Multi-threaded Asynchronous File Copy Class for C#

It used to be that, because disk I/O was the biggest bottleneck, copying files on multiple threads was inefficient and could even slow the copy down.  Multiple simultaneous copy jobs would spend most of their time just moving the read heads between disk sectors.  Modern disks, however, have vastly improved read-ahead caching.  SSDs have no read heads to position, which eliminates that seek time.  Multiple disks in a RAID array or an advanced SAN further reduce the I/O bottleneck for simultaneous file copies.

In testing, I have found that multi-threaded copying cuts the time to copy multiple files by twenty to fifty percent compared with copying the files one at a time.  This holds true whether copying locally, across a LAN, or across a WAN.  The TestCopyForm project linked below has a sample application with this class where you can adjust the buffer size and choose whether to copy the files on multiple threads or one at a time.  You can test various combinations to see what works best for your environment.
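If you want a quick way to compare buffer sizes outside the sample application, here is a minimal sketch that times a plain FileStream copy of one file.  It does not use FileCopyMTStream; the CopyTiming class and TimeStreamCopy method are hypothetical names made up for this illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class CopyTiming
{
    // Hypothetical helper: copies one file with a read/write loop
    // and returns how long the copy took.
    public static TimeSpan TimeStreamCopy(string source, string destination, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        Stopwatch watch = Stopwatch.StartNew();

        using (FileStream src = new FileStream(source, FileMode.Open, FileAccess.Read))
        using (FileStream dst = new FileStream(destination, FileMode.Create, FileAccess.Write))
        {
            int read;
            // Read returns 0 at end of file
            while ((read = src.Read(buffer, 0, buffer.Length)) > 0)
                dst.Write(buffer, 0, read);
        }

        watch.Stop();
        return watch.Elapsed;
    }
}
```

Call it in a loop over a few sizes (for example 4 * 1024, 64 * 1024 and 1024 * 1024) and compare the results.  Keep in mind that the operating system caches recently read files, so repeated runs against the same source file will look faster than a cold copy.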

Multi-Threaded Async File Copy Class

To use this class, add it to your project.  When you create an instance of the class, pass it either a single source path and destination path or a Queue of KeyValuePair objects for multiple files.  You can also use the parameterless constructor and add the files through the CopyList queue.

FileCopyMTStream fcs = new FileCopyMTStream();

string SourcePath = "c:\\somedir";
string TargetPath = "c:\\someotherdir";
string targetfile;

foreach (string sourcefile in Directory.GetFiles(SourcePath))
{
    // Key = source file, Value = destination file (same name in the target folder)
    targetfile = Path.Combine(TargetPath, Path.GetFileName(sourcefile));
    fcs.CopyList.Enqueue(new KeyValuePair<string, string>(sourcefile, targetfile));
}
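
The loop above only picks up the files directly inside the source folder.  If you need subfolders too, something like the following sketch could build the queue for you.  CopyQueueBuilder and BuildCopyQueue are hypothetical names, not part of the class.  The sketch also creates the target folders up front, on the assumption that they may not exist yet, because a FileStream opened with FileMode.Create fails if the destination folder is missing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CopyQueueBuilder
{
    // Hypothetical helper: builds Key = source / Value = destination pairs
    // for every file under sourceDir, keeping the folder structure.
    public static Queue<KeyValuePair<string, string>> BuildCopyQueue(string sourceDir, string targetDir)
    {
        string sourceRoot = Path.GetFullPath(sourceDir);
        var queue = new Queue<KeyValuePair<string, string>>();

        foreach (string sourceFile in Directory.EnumerateFiles(sourceRoot, "*", SearchOption.AllDirectories))
        {
            // Path of the file relative to the source root
            string relative = sourceFile.Substring(sourceRoot.Length)
                .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
            string targetFile = Path.Combine(targetDir, relative);

            // Make sure the destination folder exists before the copy starts
            Directory.CreateDirectory(Path.GetDirectoryName(targetFile));
            queue.Enqueue(new KeyValuePair<string, string>(sourceFile, targetFile));
        }

        return queue;
    }
}
```

The returned queue can be passed to the FileCopyMTStream(Queue<KeyValuePair<string, string>>) constructor shown in the class listing.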


Next, add event handlers to your code to handle status updates:

fcs.FileCopyStarted += Fcs_FileCopyStarted;
fcs.ProgressChanged += Fcs_ProgressChanged;
fcs.FileCopyComplete += Fcs_FileCopyComplete;
fcs.FileCopyException += Fcs_FileCopyException;


When you write the event handlers, make sure that you use Invoke to interact with the UI thread - the event handlers will be running under the context of one of the FileCopyMTStream threadpool threads.  Accessing a control directly from one of those threads is not safe and typically throws a cross-thread InvalidOperationException.
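
As an alternative to calling Invoke in every handler, here is a minimal sketch that captures the UI thread's SynchronizationContext once and posts work back to it.  This is a different technique from the Invoke approach used in this post, and UiDispatcher is a hypothetical name.  It assumes you create the object on the UI thread (for example in the form's Load handler), where Windows Forms has installed its synchronization context.

```csharp
using System;
using System.Threading;

public sealed class UiDispatcher
{
    private readonly SynchronizationContext context;

    // Create this on the UI thread so that the UI's context is captured.
    public UiDispatcher()
    {
        context = SynchronizationContext.Current;
    }

    // Queues the action on the captured context without blocking the caller.
    public void Post(Action action)
    {
        if (context == null)
            action();   // no context captured (e.g. a console app): run inline
        else
            context.Post(delegate { action(); }, null);
    }
}
```

Inside an event handler you would then wrap the control updates in a call such as ui.Post(() => { ... }).  Unlike Invoke, Post does not wait for the UI thread to finish, so the copy thread is not held up by slow UI updates.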


private void Fcs_ProgressChanged(string SourceFile, int Percentage, long CopiedSize, long FileSize)
{
    Invoke(new MethodInvoker(delegate 
    {
        if (lvFileList.Items.ContainsKey(SourceFile))
        {
            lvFileList.Items[SourceFile].SubItems[3].Text = CopiedSize.ToString("N0") + " copied";
            lvFileList.Items[SourceFile].SubItems[4].Text = Percentage.ToString() + "%";
        }
    }
    ));
}


When you are ready to kick off the copy, call the following method:


fcs.StartCopy();
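
StartCopy returns right away and the copies continue in the background.  If you need to know when every file has finished, one option is to count FileCopyComplete events.  Here is a minimal sketch using a CountdownEvent.  CopyCompletionTracker is a hypothetical name, and its handler simply matches the CompleteDelegate signature from the class.

```csharp
using System;
using System.Threading;

public sealed class CopyCompletionTracker : IDisposable
{
    private readonly CountdownEvent remaining;
    private int canceledCount;

    // fileCount = number of files queued; read CopyList.Count before StartCopy
    public CopyCompletionTracker(int fileCount)
    {
        remaining = new CountdownEvent(fileCount);
    }

    public int CanceledCount
    {
        get { return Volatile.Read(ref canceledCount); }
    }

    // Matches CompleteDelegate(string SourceFile, bool Canceled)
    public void OnFileCopyComplete(string sourceFile, bool canceled)
    {
        if (canceled)
            Interlocked.Increment(ref canceledCount);
        remaining.Signal();
    }

    // Returns true if every file reported completion before the timeout.
    public bool Wait(TimeSpan timeout)
    {
        return remaining.Wait(timeout);
    }

    public void Dispose()
    {
        remaining.Dispose();
    }
}
```

Subscribe with fcs.FileCopyComplete += tracker.OnFileCopyComplete before calling StartCopy.  The timeout matters: files that are still in the queue when a cancel arrives never raise the event, so an unbounded wait could hang.  Do not call Wait on the UI thread if your other handlers use Invoke, or the two will deadlock.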


Here is a test project that includes the class:


TestCopyForm.zip



Here is the class.  Enjoy!

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// FileCopyMTStream - File Copy Multi-Threaded Stream
// Adrian Hayes
// http://www.bearnakedcode.com
//
// A class which asynchronously copies one or more files using multiple
// threadpool threads.  It allows file copies without locking UI thread and
// provides status reports via events.
//
// License:  You are free to use this code in any of your projects as you wish.
//           It is offered without warranty.  Use at your own risk.
//           If you publish its source (or any portion thereof), please include
//           a reference back to http://www.bearnakedcode.com.
//
// Enjoy!

namespace BearNakedCode
{

    //Delegates for events
    public delegate void FileCopyStartedDelegate(string SourceFile, long FileSize);
    public delegate void FileCopyExceptionDelegate(string SourceFile, Exception Error);
    public delegate void CompleteDelegate(string SourceFile, bool Canceled);
    public delegate void ProgressChangedDelegate(string SourceFile, int Percentage, long CopiedSize, long FileSize);

    /// <summary>
    /// Copies a queue of files asynchronously with multi-threading.
    /// </summary>
    public class FileCopyMTStream
    {

        #region PublicVariables

        /// <summary>
        /// A queue of files to copy.  Elements are KeyValuePairs with Key = Source / Value = Destination.
        /// </summary>
        public Queue<KeyValuePair<string, string>> CopyList = new Queue<KeyValuePair<string, string>>();


        /// <summary>
        /// If true (the default), copies using multiple threads managed by the threadpool.
        /// If false, files are copied one at a time on a single thread.
        /// </summary>
        public bool MultiThreaded = true;


        /// <summary>
        /// Size of the copy buffer.  Default is 4 KB buffer.  
        /// </summary>
        public int BufferSize = 4 * 1024;


        /// <summary>
        /// When reporting progress, it will wait until it has copied this amount of
        /// bytes before raising the ProgressChanged event.  This is to reduce flicker
        /// in controls from overly-frequent status updates.  Default is 128K bytes.
        /// </summary>
        public long ReportInterval = (128 * 1024); // report only on each 128K block

        #endregion

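        //cancellation flag - per instance, so cancelling one copier does not cancel others.
        //volatile so that copy threads see the change promptly.
        private volatile bool CancelRequested = false;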


        #region EventHandlers

        /// <summary>
        /// Event that fires when progress has changed on a file copy.  
        /// </summary>
        public event ProgressChangedDelegate ProgressChanged;
        private void OnProgressChanged(string SourceFile, long CopiedSize, long FileSize)
        {
            int pct = (int)((double)CopiedSize * 100 / (double)FileSize);
            //int pct = (int)(Math.Round((double)CopiedSize / (double)FileSize) * 100);
            if (ProgressChanged != null)
                ProgressChanged(SourceFile, pct, CopiedSize, FileSize);
        }


        /// <summary>
        /// Event that fires when a file copy has started.
        /// </summary>
        public event FileCopyStartedDelegate FileCopyStarted;
        private void OnFileCopyStarted(string SourceFile, long FileSize)
        {
            if (FileCopyStarted != null)
                FileCopyStarted(SourceFile, FileSize);
        }

        /// <summary>
        /// Event that fires when a fatal exception occurs on a file copy.
        /// </summary>
        public event FileCopyExceptionDelegate FileCopyException;
        private void OnFileCopyException(string SourceFile, Exception Error)
        {
            if (FileCopyException != null)
                FileCopyException(SourceFile, Error);
        }

        /// <summary>
        /// Event that fires when a copy job finishes or is cancelled.
        /// </summary>
        public event CompleteDelegate FileCopyComplete;
        private void OnFileCopyComplete(string SourceFile, bool Cancelled)
        {
            if (FileCopyComplete != null)
                FileCopyComplete(SourceFile, Cancelled);
        }
        #endregion

        #region Initializers

        public FileCopyMTStream()
        {

        }

        /// <summary>
        /// Initializes a new instance of FileCopyMTStream for a single file copy.
        /// </summary>
        /// <param name="Source">Full path to the source file.</param>
        /// <param name="Destination">Full path to the destination file, including file name.</param>
        public FileCopyMTStream(string Source, string Destination)
        {
            CopyList.Enqueue(new KeyValuePair<string, string>(Source, Destination));
        }


        /// <summary>
        /// Initializes a new instance of FileCopyMTStream for multiple file copies.
        /// </summary>
        /// <param name="SourceDestinationList"></param>
        public FileCopyMTStream(Queue<KeyValuePair<string, string>> SourceDestinationList)
        {
            CopyList = SourceDestinationList;
        }

        #endregion

        /// <summary>
        /// Requests that all remaining copy jobs cancel at their current progress, close their streams and exit.
        /// </summary>
        public void CancelAll()
        {
            CancelRequested = true;
        }


        /// <summary>
        /// Begins the asynchronous file copies.
        /// </summary>
        public void StartCopy()
        {
            WaitCallback callback;

            KeyValuePair<string, string> job;

            if (MultiThreaded == false)
            {
                new Thread(delegate () { SingleThreadCopyJob(CopyList); }).Start();
            }
            else
            {
                while (CopyList.Count > 0)
                {
                    job = CopyList.Dequeue();
                    if (CheckCancelRequested(job.Key))
                        return;

                    callback = new WaitCallback(CopyJob);
                    ThreadPool.QueueUserWorkItem(callback, job);
                }

            }
        }

        /// <summary>
        /// Copies a file.  Used in thread pool operations to copy multiple files simultaneously. 
        /// </summary>
        /// <param name="job">A source/target pair to copy a file.</param>
        private void CopyJob(object job)
        {
            //copyjob - KVP.  Key = source file.  Value = destination file
            KeyValuePair<string, string> copyjob = (KeyValuePair<string, string>)job;

            if (CheckCancelRequested(copyjob.Key))
                return;

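            try
            {
                CopyFileStream(copyjob);
            }
            catch (Exception ex)
            {
                //report the failure instead of letting it crash the threadpool thread
                OnFileCopyException(copyjob.Key, ex);
            }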
        }

        /// <summary>
        /// Copies all files, one at a time, on the same thread.  
        /// Use where multiple I/O streams would present a bottleneck.
        /// </summary>
        /// <param name="JobQueue">A queue of source/target pairs of files to copy.</param>
        private void SingleThreadCopyJob(Queue<KeyValuePair<string, string>> JobQueue)
        {
            KeyValuePair<string, string> job;
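            while (JobQueue.Count > 0)
            {
                job = JobQueue.Dequeue();
                if (CheckCancelRequested(job.Key))
                    return;
                try
                {
                    CopyFileStream(job);
                }
                catch (Exception ex)
                {
                    //report the failure and move on to the next file
                    OnFileCopyException(job.Key, ex);
                }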
            }

        }

        /// <summary>
        /// Checks if the CancelRequested flag has been raised.  
        /// Sends copy Complete event notification if cancelled.
        /// </summary>
        /// <param name="SourceFile">The file currently set to be copied.</param>
        /// <returns>True if the cancel flag set.  False if OK to copy the file.</returns>
        private bool CheckCancelRequested(string SourceFile)
        {
            if (CancelRequested)
            {
                OnFileCopyComplete(SourceFile, true);
                return true;
            }

            return false;
        }


        /// <summary>
        /// Uses a FileStream to copy a file from source to destination.  
        /// Sends event notifications on progress and completion.  
        /// </summary>
        /// <param name="CopyJob">
        /// A Key Value pair of Source and Target paths for the file to copy.
        /// </param>
        private void CopyFileStream(KeyValuePair<string, string> CopyJob)
        {
            byte[] buffer = new byte[BufferSize]; // size set by BufferSize (default 4 KB)

            long reporttally = 0;

            using (FileStream source = new FileStream(CopyJob.Key, FileMode.Open, FileAccess.Read))
            {
                long filelength = source.Length;

                OnFileCopyStarted(CopyJob.Key, filelength);

                using (FileStream dest = new FileStream(CopyJob.Value, FileMode.Create, FileAccess.Write))
                {
                    long totalBytes = 0;
                    int currentBlockSize = 0;

                    while ((currentBlockSize = source.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        if (CheckCancelRequested(CopyJob.Key))
                            return;

                        dest.Write(buffer, 0, currentBlockSize);

                        totalBytes += currentBlockSize;

                        reporttally += currentBlockSize;
                        if (reporttally >= ReportInterval)
                        {
                            OnProgressChanged(CopyJob.Key, totalBytes, filelength);
                            reporttally = 0;
                        }

                    }
                }
            }

            OnFileCopyComplete(CopyJob.Key, CancelRequested);
        }
    }
}

1 comment:

Unknown said...

Hi, can you help me? I tried to use your class in a program that copies lots of files across the local network; some of them may be very big, around 1-10 GB. But the program crashes even on small ones with the message "System.OutOfMemoryException"
on this line:
byte[] buffer = new byte[BufferSizeInKB * 1024]; // 4K buffer

What can it be? How to solve that? Thanks!