TeraCopy to support multi-threading when copying/verifying

  • Started

I suggest that TeraCopy support multithreading. As you know, when copying many small files, multithreading can speed things up compared with a serial copy.

Duplicates 2
Multiple threads using multiple cores to copy or move

Use more threads, like robocopy.

Parallel File Copy

In some cases, copying many files at the same time (in parallel) is faster than copying them one after another. For example, with drive pool software or some network file transfers, parallel copying can sometimes be up to 2x faster.

I hope this feature arrives someday.

If you have any questions, please feel free to email: zhangsiyuan@mail.ustc.edu.cn.

geraud dumont

When copying files on a SharePoint server, the speed per channel is very slow, but the server can handle several open channels in parallel. Could you add a setting for the number of concurrent files (1 by default) to the settings menu?
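
To make the request concrete, here is a minimal sketch of a copy loop with a configurable number of concurrent files. It is written in Python purely for illustration and says nothing about how TeraCopy is implemented; the function name `copy_tree_parallel` and the `max_workers` parameter are hypothetical, with `max_workers=1` standing in for the suggested default.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src_dir, dst_dir, max_workers=1):
    """Copy every file under src_dir into dst_dir, max_workers files at a time.

    max_workers=1 behaves like a plain serial copy (the assumed default).
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    files = [p for p in src_dir.rglob("*") if p.is_file()]

    def copy_one(src):
        dst = dst_dir / src.relative_to(src_dir)
        # exist_ok=True keeps this safe when two threads need the same folder
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies data and timestamps
        return dst

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces all copies to finish and re-raises the first error
        return list(pool.map(copy_one, files))
```

Whether a higher value helps depends on the target: it may pay off on a high-latency server like the one described above, and it may hurt on a single spinning disk.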

Yuzhy

Keep in mind that transferring multiple files simultaneously will turn a sequential read/write job into lots of random read/write jobs, and your hard drives will become the bottleneck.


However, I am finding that even when transferring and verifying single large files, TeraCopy seems very slow on systems capable of gigabytes per second of sequential read/write (e.g. large numbers of HDDs in RAID arrays, or NVMe SSDs) and on 10GbE+ networks.


For example, hash verification of a 50GB file runs at only 350MB/sec on a Windows 10 VM, with CPU utilization at only 10%, while CrystalDiskMark on the same system can exceed 2000MB/sec on sequential read and write when using 4 threads. Even Windows 10's own transfers are more than twice as fast as TeraCopy's. CPU core counts keep going up, 10GbE networks are becoming accessible, and NVMe SSD performance is starting to hit the PCIe Gen3 x4 bandwidth limit. There should be a lot of room to optimize performance through better use of multi-threading.
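
For scale, taking 1GB as 1000MB, a 50GB file takes roughly 143 seconds at 350MB/sec and 25 seconds at 2000MB/sec. One way a single large file could use more than one thread is to let one thread read ahead while another hashes. The sketch below is an assumption-laden illustration in Python, not TeraCopy's code: SHA-256, the 8 MiB chunk size, the queue depth and the name `hash_file_pipelined` are all picked for the example.

```python
import hashlib
import queue
import threading


def hash_file_pipelined(path, chunk_size=8 * 1024 * 1024, depth=4):
    """Hash a file while a second thread reads ahead into a bounded queue."""
    chunks = queue.Queue(maxsize=depth)  # bounded, so memory use stays capped

    def reader():
        try:
            with open(path, "rb") as f:
                while block := f.read(chunk_size):
                    chunks.put(block)
            chunks.put(None)  # sentinel: end of file
        except OSError as exc:
            chunks.put(exc)  # hand the error to the hashing thread

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    digest = hashlib.sha256()
    while True:
        item = chunks.get()
        if item is None:
            break
        if isinstance(item, Exception):
            raise item
        digest.update(item)  # overlaps with the reader's next read
    thread.join()
    return digest.hexdigest()
```

This only overlaps reading with hashing. Whether it closes the gap described above depends on where the real bottleneck is, which the numbers in this thread do not settle.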

Martin
Quote from Yuzhy

Keep in mind that transferring multiple files simultaneously will turn a sequential read/write job into lots of random read/write jobs, and your hard drives will become the bottleneck.



While there is a drop even with NVMe-based flash storage, IMHO it can sometimes be ignored, e.g. when transferring a couple of big files and many more small files to the same target disk.


Maybe a UI switch, defaulting to off, would be enough to satisfy both types of users.


Anyway, I'm supporting your request for faster transfers - TeraCopy should, at the very least, be nearly on par with Windows's internal copy routine.

stephane simonetti

When you add files to an existing copy session, the running total in KB/MB/GB is not correct: after the first added list has been copied, the progress bar goes above 100% and you can't tell when the job will finish. I went back to the previous version, which worked perfectly in this respect.

mow
Quote from Yuzhy

Keep in mind that transferring multiple files simultaneously will turn a sequential read/write job into lots of random read/write jobs, and your hard drives will become the bottleneck.



Keep in mind that with SSDs, access times are negligible, so random reads won't be much of a bottleneck. For really small files (under a filesystem block each), writes would not actually be that random, because the files will be placed in adjacent blocks either way. This might even help the SSD write complete (flash) blocks.


Also, when doing hash verification, TeraCopy first hashes the source file and then the target file. If those are on different devices, both could be read and hashed in parallel.
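
A minimal sketch of that idea, assuming SHA-256 and plain Python threads (the names `sha256_of` and `verify_copy` are made up for the example and are not TeraCopy's API):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_copy(source, target):
    """Hash source and target at the same time and compare the digests."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        src_future = pool.submit(sha256_of, source)
        dst_future = pool.submit(sha256_of, target)
        return src_future.result() == dst_future.result()
```

If both files sit on the same hard drive, the two reads would compete for the same heads, so this would probably only be worth enabling when source and target are on different devices.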

Code Sector
  • Planned
Jay

Has this been implemented yet? I can't seem to find anything in the docs or on Ora about it and it is still labelled as planned after 5 years!??

Martin

To be fair, the `Planned` flag was added only 2 years ago. Admittedly, that is still quite a long time.
berkkocaturk


It makes it extremely slow to check the integrity of files. Also, because the target is much faster than the source, TeraCopy could copy the next files while hashing the ones already copied.

Windows knows which drives are SSDs, and when copying to an SSD, even multiple jobs should not have to wait for each other.
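
Roughly what I mean by copying while hashing, as a Python sketch and not a claim about how TeraCopy works internally. The function names, SHA-256 and the 1 MiB chunk size are assumptions for the example: the source is hashed as it is read during the copy, and a background thread re-reads and hashes each finished target while the next file is already being copied.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def copy_and_hash(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst and return the SHA-256 of the bytes read from src."""
    digest = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while block := fin.read(chunk_size):
            digest.update(block)
            fout.write(block)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest()


def copy_with_overlapped_verify(pairs):
    """pairs: iterable of (source, target). Returns {target: verified?}."""
    pending = []
    with ThreadPoolExecutor(max_workers=1) as verifier:
        for src, dst in pairs:
            expected = copy_and_hash(src, dst)  # runs in the calling thread
            # the target is hashed in the background during the next copy
            pending.append((dst, expected, verifier.submit(sha256_of, dst)))
        return {str(dst): fut.result() == expected
                for dst, expected, fut in pending}
```

One caveat: reading a target back right after writing it may be served from the OS cache and never touch the drive, so a real implementation would need to deal with that.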

Code Sector
  • Started