ComputerBanter.com

ComputerBanter.com (http://www.computerbanter.com/index.php)
-   Storage (alt) (http://www.computerbanter.com/forumdisplay.php?f=34)
-   -   Why is a smaller folder taking longer to be backed up than a larger one? (http://www.computerbanter.com/showthread.php?t=173494)

Yousuf Khan[_2_] January 9th 18 07:36 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
Okay, so I do daily Macrium file backups of my entire User folder, with
monthly fulls at the beginning of each month. All except one specific
subfolder which takes longer than all of the rest of the User folder
combined to backup. So I've excluded it, and I back it up in a separate
backup job which only runs twice a month, because I can't afford the
time to run that daily.

The User folder stats (minus excluded folder) are this:

Total Number of Files: 342862
Total Size: 913.78 GB
Backup Completed Successfully in 05:02:40

Now the excluded folder stats are this:
Total Number of Files: 651658
Total Size: 1.57 GB
Backup Completed Successfully in 11:47:09

As you can see, an approximately 1 TB folder is fully backed up in a
mere 5 hours, whereas a puny 1.5 GB folder (almost 600x smaller!) takes
about 12 hours to back up fully. The only difference is that there are
about twice as many files in the smaller folder as in the larger one.
Both folders are on the exact same physical drive (HDD, not SSD). File
system is NTFS. What could be causing such a drastic slowdown?
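A back-of-the-envelope check on those figures (a sketch, nothing Macrium-specific) shows where the time goes: the big job runs at bulk-transfer speed, while the small job pays a roughly fixed per-file cost.

```python
# Rough per-file arithmetic for the two backup jobs quoted above.
def job_stats(files, size_gb, h, m, s):
    secs = h * 3600 + m * 60 + s
    return {
        "sec_per_file": secs / files,         # fixed per-file overhead
        "mb_per_sec": size_gb * 1024 / secs,  # bulk throughput
    }

big = job_stats(342862, 913.78, 5, 2, 40)    # ~0.053 s/file, ~51 MB/s
small = job_stats(651658, 1.57, 11, 47, 9)   # ~0.065 s/file, ~0.04 MB/s
```

Both jobs spend roughly 50-65 ms per file; the big job amortizes that over ~2.7 MB per file, while the small one moves ~2.5 KB per file, so per-file overhead (seeks, metadata, open/close, snapshot bookkeeping) dominates.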

Yousuf Khan

Paul[_26_] January 9th 18 08:15 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
Yousuf Khan wrote:
Okay, so I do daily Macrium file backups of my entire User folder, with
monthly fulls at the beginning of each month. All except one specific
subfolder which takes longer than all of the rest of the User folder
combined to backup. So I've excluded it, and I back it up in a separate
backup job which only runs twice a month, because I can't afford the
time to run that daily.

The User folder stats (minus excluded folder) are this:

Total Number of Files: 342862
Total Size: 913.78 GB
Backup Completed Successfully in 05:02:40

Now the excluded folder stats are this:
Total Number of Files: 651658
Total Size: 1.57 GB
Backup Completed Successfully in 11:47:09

As you can see, an approximately 1 TB folder is fully backed up in a
mere 5 hours, whereas a puny 1.5 GB folder (almost 600x smaller!) takes
about 12 hours to back up fully. The only difference is that there are
about twice as many files in the smaller folder as in the larger one.
Both folders are on the exact same physical drive (HDD, not SSD). File
system is NTFS. What could be causing such a drastic slowdown?

Yousuf Khan


For someone playing along at home, this is what you're looking at.
Dialog box and so on.

https://forum.macrium.com/Topic10553.aspx

Now, does it use VSS for that ? Probably.
It freezes the volume before starting the backup, so
that files still changing don't have versions newer than
the snapshot point backed up. It backs up a copy of the file
system frozen in time, at the instant VSS completes the "freeze".

Now, to examine files and their locations, should File Explorer
be involved ? No.

OK, how about:

https://msdn.microsoft.com/en-us/lib...(v=vs.85).aspx

// Find the first file in the directory.

hFind = FindFirstFile(szDir, &ffd);

// List all the files in the directory with some info about them.

do
{
    _tprintf(TEXT("  %s\n"), ffd.cFileName);
}
while (FindNextFile(hFind, &ffd) != 0);

FindClose(hFind);

I don't think that goes near File Explorer. It shouldn't
be using a shell to do things like that. It should be
making a file system call, at a guess. The file system call
will be against the shadow volume identifier.

I've seen bugs in File Explorer, where at around 60,000 files
in a folder (individual frames from a movie), if you delete
a few files, Explorer rails the CPU on one core, for each
Explorer window where the bug is triggered. This prevents
Explorer from re-painting the window showing a few files
that you've deleted. One way to "escape", is to "drain" the
folder to zero files, delete the folder, and magically
Explorer recovers for the File Explorer window in question.

But that doesn't happen at the file system level.

*******

Small files can be small enough to have their data payload
stored directly in the $MFT ("resident" files). This by itself should
not cause a problem.
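To gauge how many files fall in that range, a quick scan like the following sketch works; the ~700-byte cutoff is an assumption (the real resident threshold depends on the MFT record size and attribute overhead), and `count_resident_candidates` is just an illustrative helper:

```python
import os

# Count files small enough that NTFS may store their data resident in
# the $MFT record itself. The ~700-byte cutoff is an approximation.
RESIDENT_CUTOFF = 700

def count_resident_candidates(root):
    total = resident = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files we can't stat
            total += 1
            if size <= RESIDENT_CUTOFF:
                resident += 1
    return resident, total
```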

*******

Windows Defender can be "scanning the ****" out of any file
your programs happen to touch. Doesn't matter that the files
are "just going into a backup". There is an Admin Powershell command
to turn off Windows Defender real time.

Set-MpPreference -DisableRealtimeMonitoring $true

It will probably keep that setting until your next reboot,
or until you set it back to $false.

*******

To establish baseline performance, you can use Robocopy to copy the
files to a RAMdisk. That should expose any source HDD performance
anomalies.

And in the "not a fair fight" category, you can also try 7ZIP
and use "Store Mode", which does not compress a file tree when
creating an archive on another storage device. That should
return a much better result than your Macrium result.

It does take Macrium about a minute to generate an index at
the end of the run, during which the disk light will stop
flashing. It then writes out the index before completing
the backup.

*******

For the HDD, run HDTune 2.55 and benchmark the drive, to make
sure there aren't any "bad patches" evident. If the drive
had a lot of reallocations, that may account for a bit of
the trouble.

Paul

VanguardLH[_2_] January 9th 18 09:25 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
Yousuf Khan wrote:

Okay, so I do daily Macrium file backups of my entire User folder, with
monthly fulls at the beginning of each month. All except one specific
subfolder which takes longer than all of the rest of the User folder
combined to backup. So I've excluded it, and I back it up in a separate
backup job which only runs twice a month, because I can't afford the
time to run that daily.

The User folder stats (minus excluded folder) are this:

Total Number of Files: 342862
Total Size: 913.78 GB
Backup Completed Successfully in 05:02:40

Now the excluded folder stats are this:
Total Number of Files: 651658
Total Size: 1.57 GB
Backup Completed Successfully in 11:47:09

As you can see, an approximately 1 TB folder is fully backed up in a
mere 5 hours, whereas a puny 1.5 GB folder (almost 600x smaller!) takes
about 12 hours to back up fully. The only difference is that there are
about twice as many files in the smaller folder as in the larger one.
Both folders are on the exact same physical drive (HDD, not SSD). File
system is NTFS. What could be causing such a drastic slowdown?


Disable your anti-virus and redo the backup to see if the AV was getting
in the way. Do you have more than one anti-virus or anti-malware
program installed? Despite their claims (like Malwarebytes saying they're
compatible with Avast), that's not true. If you install multiple
anti-virus/malware programs, only one should be active at a time (i.e.,
its real-time scanner is running) while the others should be quiescent.
The others should only be used as on-demand or manual scanners. I've
seen conflicting AVs trigger each other to scan the same file: one opens
a file for inspection, the other detects the file access and scans it in
turn, and they go back and forth, accumulating thousands of file I/Os in
the first couple of minutes while slowing the whole system.

You might also want to disable any anti-exploit programs. It took a while
to narrow down, but I found Malwarebytes Anti-Exploit was causing disk
slowdowns and instability or improper behavior in some software.
Microsoft's EMET can do the same thing. Malwarebytes used to say MBAE
was compatible with EMET, but that wasn't true, either.

You might also want to disable all startup programs (use msconfig.exe or
SysInternals' AutoRuns), reboot, and retest the backup to see if it
takes a reasonable time. If so, you have a conflict with one of the
programs you load when starting Windows or when you log in to your
Windows account.

In-use files will cause a hang while waiting for them to stop being in
use. Have you tried logging in under the Administrator account and then
backing up the C:\Users\youraccount folder?

Is VSC (Volume Shadow Copy) Service aka VSS enabled in your Windows? Go
into the services manager (services.msc) and check that the "Volume Shadow
Copy" service is set to Manual. That means any program that issues a call to
use that service will start that service. It does not need to have
Automatic startup since it is only needed when a program wants to use
it. You don't want it set to Disabled. Under its Dependencies tab,
check that the services on which it depends are also set to Manual or
Automatic startup.

In a command shell with admin privileges, run:

vssadmin list providers

You should see Microsoft's provider (service) and may see others added
by backup software. I don't have a paid version of Macrium Reflect,
just the free version, so I don't know if the paid versions have Macrium
adding their own VSS provider. Then run:

vssadmin list writers

You're probably not employing all of them but check each has:

State: stable
Last error: No error

If the output is blank, there is a registry problem regarding
permissions preventing vssadmin from reading those registry entries, or
you have a major problem with VSS.
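Eyeballing dozens of writers is tedious; a small parser over captured `vssadmin list writers` output can flag the unhealthy ones. This is a sketch against the field names quoted above (`State:`, `Last error:`); `unhealthy_writers` is a hypothetical helper, not part of any Microsoft tooling:

```python
import re

# Parse "vssadmin list writers" output and report writers whose State
# is not "Stable" or whose Last error is not "No error".
def unhealthy_writers(text):
    bad = []
    # Each writer block starts with "Writer name: '...'"
    for block in re.split(r"(?=Writer name:)", text):
        name = re.search(r"Writer name:\s*'([^']+)'", block)
        if not name:
            continue
        state = re.search(r"State:\s*\[\d+\]\s*(.+)", block)
        error = re.search(r"Last error:\s*(.+)", block)
        ok = (state and "Stable" in state.group(1)
              and error and error.group(1).strip() == "No error")
        if not ok:
            bad.append(name.group(1))
    return bad
```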

Macrium has its Other Tasks menu under which there is "View Windows VSS
Events". Check if that shows any errors. As I recall, each backup will
not only have a branch showing the backup process but a sub-branch for
VSS where you can look for errors, too.

You would expect that your own %userprofile% folder would have all
privileges assigned to your Windows account, but I've run into problems
under the %userprofile%\AppData folder. Apps write under there and
might change permissions so only they can access their data. You can
use AccessEnum from SysInternals to scan a folder and its subfolders and
list their permissions. In its default setup, it will show Read, Write,
and Deny privileges on the selected parent (root) folder and all
subfolders. That would be a start to determine whether your Windows
account (under which the Macrium job executes) has read permission to
everything under the root folder you specify in the Macrium backup job.
I just ran it, and several folders were listed as "???" for read
privilege (and were in the %temp% folder which is, yep, under the
AppData folder), even though I was logged in under my Windows account,
which is in the Administrators security group. I even saw a couple of
subfolders with "Access denied" for read privilege. It would list
privileges for a folder (with read and write privileges) but then
sometimes follow with a wildcarded path\* entry, and I don't know what
it is reporting with that syntax. If I right-click on the path and
select Properties, that dialog appears. If I right-click on path\*, an
error dialog pops up saying that I don't have permissions. The Help -
Contents dialog is too terse and doesn't mention what path\* means in
the parlance of this tool.
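A rough scripted first pass at the same question (which paths the current account can't read) is possible without AccessEnum. Note that `os.access` only approximates effective NTFS ACLs, and the account running the scan may differ from the one running the backup job, so treat this as a sketch:

```python
import os

# Walk a tree and collect paths the current user cannot read.
# os.access() checks are approximate on NTFS; a path that passes here
# can still fail under a different account, such as the one running
# the backup job.
def unreadable_paths(root):
    denied = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                denied.append(path)
    return denied
```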

You didn't identify the problematic subfolder. Knowing it might
indicate what could be locking it up. In fact, all you say is "User"
folder. That could include all Windows accounts (C:\Users) instead of
just yours (C:\Users\youraccount). You also did not state if you are
doing an image backup or a logical file backup. I only have the free
version, which only saves images. Maybe the paid version also lets you
do file backups.

What level of compression did you select in the Macrium backup job(s)?
The highest compression level will take a LOT longer to complete with
little gain in backup file size.

A couple of options increase the backup job time: verify backup and
verify file system. However, I recommend using both even though the
backup jobs take longer.

What priority was set for the backup job(s)? A low priority means
anything else will get a bigger chunk of the CPU's time.

You did not mention if you are using the free or paid version of Macrium
Reflect. The paid version has additional backup options. One, you can
do incremental backups. You did not specify if your backups are the
same type: full, differential, incremental. Two, there is a backup type
(synthetic) that merges incrementals and differentials into their parent
full (base) backup image. That will take a lot longer to meld those
files together than keeping them separate. From the manual:

When purging Incremental backups, if the backup set only contains a
Full backup followed by Incremental backups, then this option causes
the Full backup to be 'rolled forward' to create a Synthetic Full
backup. This is also known as Incremental Forever.

If you have ever merged a full backup with subsequent differential or
incremental backups, you know those take a while. The synthetic full
backup (merged full + its incrementals) occurs based on your retention
schedule.

And just in case the files in the problematic folder happen to be on a
bad spot on the hard disk (the only reason I left intact your
cross-post to comp.sys.ibm.pc.hardware.storage, since everything else
is a software issue), have you run "chkdsk c: /r" yet?

Yousuf Khan[_2_] January 10th 18 07:08 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
On 1/9/2018 2:15 AM, Paul wrote:
Now, does it use VSS for that ? Probably.
To freeze the volume before starting the backup, such
that any files still changing, don't get their latest version
backed up. It will backup a copy of the file system frozen
in time, at the instant the VSS completes the "freeze".


Yup, VSS is involved in all of the Macrium backups. Macrium used to have
their own internal version of VSS in the past, but I don't know if that
still exists or not. Perhaps Macrium now believes that Microsoft's VSS
is fully reliable, so it doesn't need its own proprietary version?

Now, to examine files and their locations, should File Explorer
be involved ? No.


A piece of software that knows how to setup and use VSS should have no
excuse to do something as kludgy as calling File Explorer, rather than
making its own system call.

Windows Defender can be "scanning the ****" out of any file
your programs happen to touch. Doesn't matter that the files
are "just going into a backup". There is an Admin Powershell command
to turn off Windows Defender real time.

Set-MpPreference -DisableRealtimeMonitoring $true

It will probably keep that setting until your next reboot,
or until you set it back to $false.


Yeah, no, I've long since excluded that folder from scans, ever since
the Windows 7 and Security Essentials days. I just took a look and the
exclusion is still in effect.

To establish baseline performance, you can use Robocopy to copy the
files to a RAMdisk. That should expose any source HDD performance
anomalies.

And in the "not a fair fight" category, you can also try 7ZIP
and use "Store Mode", which does not compress a file tree when
creating an archive on another storage device. That should
return a much better result than your Macrium result.


Hmm, those are interesting ideas.

It does take Macrium about a minute, to generate an index at
the end of the run, during which the disk light will stop
flashing. And then it writes out the index before completing
the backup.

*******

For the HDD, run HDTune 2.55 and benchmark the drive, to make
sure there aren't any "bad patches" evident. If the drive
had a lot of reallocations, that may account for a bit of
the trouble.


This used to run out of an SSD in the distant past, and it still took
hours to complete even back then! It was the only time I ever saw the
SSD activity time get to 100%! Then when it outgrew the SSD, I put it on
the HDD, not for performance reasons, but for capacity reasons.

Yousuf Khan

Yousuf Khan[_2_] January 10th 18 11:41 PM

Why is a smaller folder taking longer to be backed up than a larger one?
 
On 1/10/2018 1:08 AM, Yousuf Khan wrote:
On 1/9/2018 2:15 AM, Paul wrote:
To establish baseline performance, you can use Robocopy to copy the
files to a RAMdisk. That should expose any source HDD performance
anomalies.

And in the "not a fair fight" category, you can also try 7ZIP
and use "Store Mode", which does not compress a file tree when
creating an archive on another storage device. That should
return a much better result than your Macrium result.


Hmm, those are interesting ideas.


Okay, so I took your advice, and ran a Robocopy to a RAMdisk to see how
fast that goes. It still took over 2.5 hours just to copy to RAMdisk!

Robocopy's own output showed:
             Total    Copied   Skipped  Mismatch    FAILED    Extras
  Dirs :        21        20         1         0         0         0
 Files :    652802    652802         0         0         0         0
 Bytes :   1.580 g   1.580 g         0         0         0         0
 Times :   2:38:31   2:35:58                       0:00:00   0:02:32


Speed : 181312 Bytes/sec.
Speed : 10.374 MegaBytes/min.
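Dividing the active copy time by the file count (figures above) shows the run is seek-bound rather than bandwidth-bound:

```python
# Per-file cost of the Robocopy run: ~14 ms per file, which is in the
# ballpark of a single 7200 rpm HDD seek. So the copy is dominated by
# per-file positioning and metadata work, not data transfer.
copy_seconds = 2 * 3600 + 35 * 60 + 58     # 2:35:58 active copy time
files = 652802
ms_per_file = copy_seconds / files * 1000  # ~14.3 ms per file
```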

Strange observations: the total size of the files is 1.6 GB, while the
total allocated space is 3.2 GB. This makes sense, in that NTFS clusters
are probably not being fully utilized from file to file. However, the
strange thing I ran into on the RAMdisk is that I constantly kept
running out of space, even though I allocated more than 3.2 GB
every time. I started out allocating 3.5 GB, then 5 GB, and then finally
8 GB, and then it finally worked! Right now, my 8 GB RAMdisk is showing
used space of 5.9 GB, while free space is 2.1 GB. So if the files are
only taking up 3.2 GB, then 2.7 GB of space is being taken up by
something else? NTFS metadata, maybe?
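The 1.6 GB vs 3.2 GB gap is what you'd expect from cluster slack: with the default 4 KB NTFS cluster size, every non-resident file occupies a whole number of clusters. A sketch of the arithmetic from the totals above (assuming 4 KB clusters):

```python
# Cluster-slack estimate: each file rounds up to a whole number of
# 4 KB clusters, so ~650k tiny files waste a lot of allocated space.
CLUSTER = 4096

def allocated(size_bytes, cluster=CLUSTER):
    # round the file size up to the next cluster boundary (min 1 cluster)
    return max(1, -(-size_bytes // cluster)) * cluster

files = 652802
total_bytes = int(1.580 * 1024**3)         # ~1.58 GB of file data
avg = total_bytes // files                 # ~2.6 KB average file size

# If every file were exactly average-sized, allocation would be:
est_allocated_gb = files * allocated(avg) / 1024**3   # ~2.5 GB floor
```

The real allocation (3.2 GB) comes out higher than this uniform-size floor because actual sizes vary, pushing many files into a second cluster.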

Then I tried an archiving operation from the RAMdisk back to the HDD,
using the zip-store method, and that took 51:49 minutes. I assume it
will take Macrium about as long to archive off of the RAMdisk as a zip
archiver would, so I could conceivably save a ton of time just by
copying the folder to a RAMdisk and then archiving off of the RAMdisk.
But then I'd need to keep an 8GB RAMdisk in memory for those few times
that I need to archive this folder?!?

Yousuf Khan

Yousuf Khan[_2_] January 11th 18 12:04 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
On 1/9/2018 3:25 AM, VanguardLH wrote:
Disable your anti-virus and redo the backup to see if the AV was getting
in the way. Do you have more than one anti-virus or anti-malware
program installed? Despite their claims (like Malwarebytes saying they're
compatible with Avast), that's not true. If you install multiple
anti-virus/malware programs, only one should be active at a time (i.e.,
its real-time scanner is running) while the others should be quiescent.
The others should only be used as on-demand or manual scanners. I've
seen where conflicting AVs would trigger each other to scan the same
file that one had opened for inspection, the other detects the file
access so it scanned the file, and it went back and forth between the
AVs while accumulating thousands of file I/O in the first couple of
minutes all the while slowing the system.


No, the AV has long since been excluded from checking this folder, way
back when I was still running Windows 7 or maybe even XP. I'm just using
Defender here, so the exclusion dates back to the Security Essentials days.

I don't run any other real-time malware defenders; I just run them on an
as-needed basis from time to time.

Yousuf Khan

Yousuf Khan[_2_] January 11th 18 12:08 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
On 1/9/2018 3:25 AM, VanguardLH wrote:
Is VSC (Volume Shadow Copy) Service aka VSS enabled in your Windows? Go
into the services manager (services.msc) and check "Volume Shadow Copy"
service is set to Manual. That means any program that issues a call to
use that service will start that service. It does not need to have
Automatic startup since it is only needed when a program wants to use
it. You don't want it set to Disabled. Under its Dependencies tab,
check that the services on which it depends are also set to Manual or
Automatic startup.

In a command shell with admin privileges, run:

vssadmin list providers


Yes, actually during testing I tried to create my own shadow volume, using:

vssadmin create shadow

But I found that this isn't supported under Windows 10's vssadmin; these
are the only commands supported:

---- Commands Supported ----

Delete Shadows - Delete volume shadow copies
List Providers - List registered volume shadow copy providers
List Shadows - List existing volume shadow copies
List ShadowStorage - List volume shadow copy storage associations
List Volumes - List volumes eligible for shadow copies
List Writers - List subscribed volume shadow copy writers
Resize ShadowStorage - Resize a volume shadow copy storage association

Macrium can probably create these volume shadows directly through API
calls, but it looks like Microsoft has disabled its use through the
command line.

Yousuf Khan

VanguardLH[_2_] January 11th 18 12:59 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
Yousuf Khan wrote:

On 1/9/2018 3:25 AM, VanguardLH wrote:
Is VSC (Volume Shadow Copy) Service aka VSS enabled in your Windows? Go
into the services manager (services.msc) and check "Volume Shadow Copy"
service is set to Manual. That means any program that issues a call to
use that service will start that service. It does not need to have
Automatic startup since it is only needed when a program wants to use
it. You don't want it set to Disabled. Under its Dependencies tab,
check that the services on which it depends are also set to Manual or
Automatic startup.

In a command shell with admin privileges, run:

vssadmin list providers


Yes, actually during testing I tried to create my own shadow volume, using:

vssadmin create shadow

But I found that this isn't supported under Windows 10's vssadmin; these
are the only commands supported:

---- Commands Supported ----

Delete Shadows - Delete volume shadow copies
List Providers - List registered volume shadow copy providers
List Shadows - List existing volume shadow copies
List ShadowStorage - List volume shadow copy storage associations
List Volumes - List volumes eligible for shadow copies
List Writers - List subscribed volume shadow copy writers
Resize ShadowStorage - Resize a volume shadow copy storage association

Macrium can probably create these volume shadows directly through API
calls, but it looks like Microsoft has disabled its use through the
command line.


Same help shown when I run "vssadmin /?" on Windows 7. You probably
need a server version of Windows to get the "create shadow" directive in
vssadmin (I found a forum thread where a respondent said only the server
version of Windows 10 supports the create directive in vssadmin), or
Microsoft expects admins to use PowerShell (with admin privs) to issue
VSS commands. The Home editions, like what I have at home, are
crippled.

https://mcpmag.com/articles/2015/12/...ow-copies.aspx
https://superuser.com/questions/1125...n-command-line

That mentions going to a drive's properties dialog and looking at the
Shadow Copies tab. Not there in my Win7 *Home* edition.

I've seen several articles that mention using the diskshadow tool to
test VSS, like:

https://www.veeam.com/kb1980
https://support.intronis.com/Knowled..._Independently

More info at:

https://docs.microsoft.com/en-us/win...nds/diskshadow

However, the tool is not available on client (workstation) editions of
Windows. Looks like you're stuck using tools that issue the VSC API
calls or use a script in Powershell. Yeah, sucks.

VanguardLH[_2_] January 11th 18 01:17 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
Yousuf Khan wrote:

On 1/10/2018 1:08 AM, Yousuf Khan wrote:
On 1/9/2018 2:15 AM, Paul wrote:
To establish baseline performance, you can use Robocopy to copy the
files to a RAMdisk. That should expose any source HDD performance
anomalies.

And in the "not a fair fight" category, you can also try 7ZIP
and use "Store Mode", which does not compress a file tree when
creating an archive on another storage device. That should
return a much better result than your Macrium result.


Hmm, those are interesting ideas.


Okay, so I took your advice, and ran a Robocopy to a RAMdisk to see how
fast that goes. It still took over 2.5 hours just to copy to RAMdisk!

Robocopy's own output showed:
             Total    Copied   Skipped  Mismatch    FAILED    Extras
  Dirs :        21        20         1         0         0         0
 Files :    652802    652802         0         0         0         0
 Bytes :   1.580 g   1.580 g         0         0         0         0
 Times :   2:38:31   2:35:58                       0:00:00   0:02:32

Speed : 181312 Bytes/sec.
Speed : 10.374 MegaBytes/min.

Strange observations: the total size of the files is 1.6 GB, while the
total allocated space is 3.2 GB. This makes sense, in that NTFS clusters
are probably not being fully utilized from file to file. However, the
strange thing I ran into on the RAMdisk is that I constantly kept
running out of space, even though I allocated more than 3.2 GB
every time. I started out allocating 3.5 GB, then 5 GB, and then finally
8 GB, and then it finally worked! Right now, my 8 GB RAMdisk is showing
used space of 5.9 GB, while free space is 2.1 GB. So if the files are
only taking up 3.2 GB, then 2.7 GB of space is being taken up by
something else? NTFS metadata, maybe?

Then I tried an archiving operation from the RAMdisk back to the HDD,
using the zip-store method, and that took 51:49 minutes. I assume it
will take Macrium about as long to archive off of the RAMdisk as a zip
archiver would, so I could conceivably save a ton of time just by
copying the folder to a RAMdisk and then archiving off of the RAMdisk.
But then I'd need to keep an 8GB RAMdisk in memory for those few times
that I need to archive this folder?!?

Yousuf Khan


Did you use a tool to check which, if any, of the source files have
alternate data streams (ADS)? The size of those alternate streams is
not reported by normal tools.

When you download a file from a web server, an ADS is added to the file
to flag the "zone" for that file was the Internet. That's why you can
get some security issues regarding files you downloaded versus files you
copied from storage media. The 26-byte ADS is named Zone.Identifier.
The ZoneID attribute just has a numeric value (3 = Internet).
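For reference, the stream rides along as a suffix on the NTFS path, and its payload is a tiny INI fragment. This sketch only shows the naming convention and the parse; actually reading an ADS requires an NTFS volume, and `zone_stream_path`/`parse_zone_id` are illustrative helpers:

```python
# The Zone.Identifier stream is reachable on NTFS by appending
# ":Zone.Identifier" to the file path. Its payload is a small INI
# fragment; ZoneId=3 marks "Internet".
def zone_stream_path(path):
    return path + ":Zone.Identifier"

def parse_zone_id(stream_text):
    for line in stream_text.splitlines():
        if line.startswith("ZoneId="):
            return int(line.split("=", 1)[1])
    return None

sample = "[ZoneTransfer]\r\nZoneId=3\r\n"   # typical downloaded-file marker
```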

An easy way to strip any ADS from files is to copy the files to a FAT
drive. ADS is only supported in NTFS, not in FAT. There are other
tools or tricks to strip ADS(es) from a file but that's the one that
comes to mind right now.

I've used Stream Armor to find files that have one, or more, ADS
attached to them; see http://securityxploded.com/streamarmor.php. It
knows some ADS types: favicon, Zone.Identifier, and text file (although
the content of the stream could be huge). If multiple ADSes, especially
unknown types, are attached to a file, it flags them as suspicious.

For example, I could send you a copy of notepad.exe which is 189 KB in
size, but it could have an ADS attached to it that is gigabytes in size.
Of course, the ADS only comes along if however you get the file
preserves it.
https://www.owasp.org/index.php/Wind...te_data_stream

Yousuf Khan[_2_] January 11th 18 03:19 AM

Why is a smaller folder taking longer to be backed up than a larger one?
 
On 1/10/2018 6:59 PM, VanguardLH wrote:
Same help shown when I run "vssadmin /?" on Windows 7. You probably
need a server version of Windows to get the "create shadow" directive in
vssadmin (I found a forum thread where a respondent said only the server
version of Windows 10 supports the create directive in vssadmin), or
Microsoft expects admins to use PowerShell (with admin privs) to issue
VSS commands. The Home editions, like what I have at home, are
crippled.


I think I found an alternative way of doing it; it's built into
other Windows commands. As usual, Microsoft doing random **** to
make life harder for no apparent reason.

https://superuser.com/questions/1125...n-command-line

You can get that through PowerShell commands:

powershell.exe -Command (gwmi -list win32_shadowcopy).Create('E:\','ClientAccessible')

And even non-PS commands:

wmic shadowcopy call create Volume='E:\'

Yousuf Khan



Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2018, Jelsoft Enterprises Ltd.