
Pre-allocate disk space for all files

Posted: Sun Mar 06, 2016 9:43 am
by Robertomcat
Hello, good day.

I am indebted for some time about doing "Pre-allocate disk space for all files". I'm running Linux Mint "Rosa" (ext4), with all of qBittorrent's files hosted on a four-bay micro server (xfs). When pre-allocation is enabled, are the files scattered across the hard drive (fragmented data), or written into a single region where the entire file is contiguous? Thank you very much and greetings!

Re: Pre-allocate disk space for all files

Posted: Sun Mar 06, 2016 11:00 am
by ciaobaby
I am indebted for some time about doing "Pre-allocate disk space for all files".
Indebted????

indebted means that you owe another entity something for a service or assistance that you received from them.

So you may be "indebted to me" if I answer your query appropriately. :)

Anyhow;

Pre-allocate does NOT do what you think it does. The function of pre-allocate is to 'zero fill' the space that has already been allocated; it does NOT prevent file fragmentation. The solution to file fragmentation on Linux systems is to NOT use NTFS [or FAT] as the disc filing system. ext2, ext3 & ext4 are more 'intelligent' than the Micro$oft filing systems, and WILL deal with any fragmentation (should it occur) automatically, as and when it happens, by moving files around.
from HTG Defragment
How Linux File Systems Work
Linux’s ext2, ext3, and ext4 file systems – ext4 being the file system used by Ubuntu and most other current Linux distributions – allocate files in a more intelligent way. Instead of placing multiple files near each other on the hard disk, Linux file systems scatter different files all over the disk, leaving a large amount of free space between them. When a file is edited and needs to grow, there’s usually plenty of free space for the file to grow into. If fragmentation does occur, the file system will attempt to move the files around to reduce fragmentation in normal use, without the need for a defragmentation utility.
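For reference, qBittorrent's pre-allocation boils down to an fallocate()-style call. Here is a minimal Linux sketch in Python (the `preallocate` helper name is mine, not qBittorrent's): the filesystem reserves the blocks up front, but it still decides where those blocks live.

```python
import os
import tempfile

def preallocate(path, size):
    """Reserve `size` bytes for `path` up front (hypothetical helper).

    os.posix_fallocate() asks the filesystem to reserve the blocks now;
    the filesystem (ext4, xfs, ...) still decides WHERE they go, so
    pre-allocation by itself does not guarantee contiguity.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)

# Reserve 1 MiB for a download before any payload data arrives.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
preallocate(tmp.name, 1024 * 1024)
allocated_size = os.stat(tmp.name).st_size   # already 1 MiB on disk
os.unlink(tmp.name)
```

That said, extent-based filesystems like ext4 and xfs will usually satisfy a single fallocate request for the whole file with one large extent when enough contiguous free space exists, which is the practical benefit, even though placement is never guaranteed.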

Re: Pre-allocate disk space for all files

Posted: Sun Mar 06, 2016 11:09 am
by Robertomcat
ciaobaby wrote:
I am indebted for some time about doing "Pre-allocate disk space for all files".
Indebted????
[...]
Pre-allocate does NOT do what you think it does. The function of pre-allocate is to 'zero fill' the already allocated space, it does NOT prevent file fragmentation.
[...]
Hi mate, thanks for answering! Forgive me for using words that don't fit, like "indebted" ;D. The truth is I'm using Google Translate, because I'm from Valencia (Spain) and understand very little English. Thanks for your reply; it's now clear to me what pre-allocation does.

Re: Pre-allocate disk space for all files

Posted: Sun Mar 06, 2016 11:38 am
by ciaobaby
because I'm from Valencia (Spain) and I understand very little English
Yeah, I guessed as much, and I will assume that your English is going to be FAR better than my Spanish. :)

Re: Pre-allocate disk space for all files

Posted: Sun Mar 06, 2016 11:40 am
by Robertomcat
ciaobaby wrote:
because I'm from Valencia (Spain) and I understand very little English
Yeah I guessed as much,  and I will assume that your English is going to be FAR better than my Spanish. :)
Buffff, my English is very, very bad!!!  :'(

Re: Pre-allocate disk space for all files

Posted: Fri Mar 30, 2018 5:50 am
by 2family4jeff
I want to clarify and elaborate on a dangerous (and very common) misconception about Windows file systems, and other file systems in general. It is completely INCORRECT to state that Linux file systems (specifically EXT4, but the same can be said for EXT2, EXT3, ReiserFS, Solaris filesystems, and the various flavours of journaling file systems you are likely to encounter in the real world) are "MORE intelligent". In fact, depending on your point of view and your personal tolerance for risk of data loss, NTFS is BY FAR the most intelligent file system of them all. There is ABSOLUTELY no comparison between NTFS and EXT4 or ReiserFS when it comes to data protection and recoverability.

The reason NTFS often seems to get more fragmented than EXT4 and the rest is a unique and useful background feature that most people are either not aware of or do not fully appreciate, called "Shadow Copies". In my opinion (and my opinion should count, since I have personally managed Exabyte-level backup and recovery services for several hundred commercial clients, with individual services for several thousand end users, at a rate of over 150 million unique files processed per night to a single data center from 8 countries spread over 3 continents), Shadow Copies are the closest thing you can get to having a backup of all your files, even when you have no backup.

Let me explain the magic... when a file is first saved, the information is written to the largest available block of unused space. NTFS breaks each file down into smaller units called "blocks", writes the blocks as close together and in sequence as possible within that free space, and then, if necessary, writes the blocks that wouldn't fit into the next largest area of unused space, until the whole file is saved. This is basically the same as EXT4. The next thing it does is record the location of the file, the date it was saved, and a list of all the sectors where blocks of the file are saved. Again, this is the same as EXT4 and ReiserFS, and pretty much everything but FAT32 does the same thing at this stage.

BUT the real magic happens every time you open and modify the file. Unlike the other systems, NTFS NEVER over-writes any block of data with new information if it doesn't need to. So, if you save a document with 3 pages of data, then open it, delete your last paragraph, and then type two more pages, NTFS leaves the original data alone and saves all of the changes as NEW blocks in the largest empty space it can find, just like it did when the file was originally made. Then it goes to the index entry for that file and makes a new entry with a new date and a new list of locations where the blocks are saved. The old blocks are not listed on the new entry, so when you open the file they are not loaded; you no longer see the original information, only the blocks that were valid for the NEW date. The OLD entry is moved to a new index, called a "shadow copy", that sits on a different part of the drive. NTFS continues this practice until all space is used up, and then begins deleting the oldest shadow copy indexes, which basically means that all the blocks of data listed in those indexes are now considered free space, and new files are saved there just as if it were actually empty space.

This means 2 magical things... #1) Let's say you accidentally deleted a file, or got a virus infection in a folder full of important files, or you just realized you made a horrible mistake re-writing a chapter of your novel after months of changes and want to go back in time and restore the original document... well, don't bother trying to clean the virus, or digging through a backup disk, or turning to expensive recovery services. Just right-click on the folder, click Properties, click the Previous Versions tab, choose the date you want from the list of available shadow copies, and click Open... you are now looking at the folder in the exact state it was in on that day, down to the block level.

If a file was infected last week, go back to last week plus 1 day and get your uninfected copy. No virus cleaning necessary; just delete the infected version and restore the uninfected one from the shadow copy. You can do this with files, folders, and even whole disks. The larger the size you want to restore, the less far back you can go. Usually you can restore a whole drive (like your C drive, or your entire media drive) up to about a month or several weeks back. Folders could go back several months. Some files that you access a lot and change regularly could go back YEARS in many cases. And EACH shadow copy is stored in its entirety; you don't have to pick which one you want, you treat each version as a separate file. And this doesn't even use up any of your disk space.

The Windows System Restore feature (which many people are scared of, because the OLD way it worked before Windows Vista/7 was prone to viruses and other issues) uses this feature of NTFS on modern computers running Windows Vista, 7, 8, or 10. This means if your computer is badly infected with Cryptolocker, or some other virus you can't clean, you can run System Restore and just go back in time to before you got infected. Instantly. No reformatting.

If you don't see the Previous Versions tab (it's hidden on some Home versions of Windows), you can enable it by sharing the folder and browsing to it over a network, or use this trick: type \\localhost\C$ into your address bar, then right-click on a folder and go to Properties, and you will see the Previous Versions tab. This works because the feature was originally developed for NAS storage on big corporate networks, but it's so amazing it was quickly added to NTFS on regular PCs.

The second MAGIC trick this enables is that you can now copy, rename, move, or even edit files that are already open in another program. NTFS just saves your changes to new areas and marks the parts of the file that were locked as a new shadow copy. That means you can back up your whole hard drive while using the computer, even if you are editing files while the backup is running.


The amount of fragmentation this causes is WAY less than people think. On average, even a VERY fragmented NTFS disk on a computer that is used daily is less than 5% fragmented relative to the total size of the volume. If you are torrenting like mad, using 80-90% of your disk at all times, and constantly deleting stuff to make room for new downloads... that's probably the worst-case scenario for fragmentation, and STILL it would RARELY ever reach 20% fragmentation, and that would be EXTREME. In terms of performance and efficiency, this would barely add a few microseconds to read and write operations on the fragmented files. As in, for every 1 second a badly fragmented NTFS volume takes to access a file, a perfectly unfragmented EXT4 volume would need about 0.999995 seconds (a difference of 5 microseconds), so over an 8-hour day (28800 seconds), the MOST extra time this could possibly cost on NTFS would be 144 ms.
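The arithmetic above can be spot-checked (figures taken directly from the text):

```python
# Figures from the post: a badly fragmented NTFS volume takes 1 s per
# access where an unfragmented EXT4 volume takes 0.999995 s.
per_second_penalty = 1.0 - 0.999995      # 5 microseconds per second
seconds_per_day = 8 * 60 * 60            # an 8-hour day = 28800 s
daily_penalty_ms = per_second_penalty * seconds_per_day * 1000
# daily_penalty_ms comes out to 144 ms, matching the claim above.
```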

Are you REALLY willing to risk data loss just to save yourself a fraction of a second per day??

Re: Pre-allocate disk space for all files

Posted: Fri Mar 30, 2018 7:32 am
by Switeck
I've done a LOT of extreme speed tests with various BitTorrent clients:
index.php/topic,3956.0.html

I used ramdrives formatted with either FAT32 or NTFS. There was a small speed loss using NTFS vs FAT32 -- maybe 1-5%.
NTFS was slightly better at minimizing fragmentation than FAT32 when dealing with the same cluster size AND using pre-allocated files.

BUT... when using sparse files (which FAT32 does not support), it was awful, even though my tests almost always consisted of a single seed and a single peer, with the peer downloading to a ramdrive that started out with minimal or no fragmentation:
"qBT created over 30,000 file fragments from a 14-file ~2.5 GB torrent (~2400 per file!). If I wasn't doing it on a ramdrive capable of 1 GB/sec even with tiny file fragments, it would've been very slow."

This is mostly due to libtorrent (which qBittorrent, Deluge, Halite, and numerous other BT clients use) having awful 16 KB chunk handling, at least on Windows systems.
index.php/topic,3787.0.html
http://forum.deluge-torrent.org/viewtopic.php?t=27725
http://forum.deluge-torrent.org/viewtopic.php?t=36959 [NTFS] Full Allocation, Sparse Files and 20.000 Fragments

This:
https://github.com/qbittorrent/qBittorr ... -376228737
...goes into more detail about why/how it happens, and it looks like the next version or two of libtorrent will enable write coalescing on Windows.
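The write-coalescing idea can be sketched in miniature: instead of issuing one write syscall per 16 KiB chunk, adjacent chunks are first merged into contiguous runs. This is an illustrative Python sketch under my own naming, not libtorrent's actual code:

```python
import os
import tempfile

CHUNK = 16 * 1024  # libtorrent's 16 KiB block size

def coalesced_write(fd, chunks):
    """Merge adjacent (offset, data) chunks into contiguous runs and
    issue one pwrite() per run instead of one per 16 KiB chunk.

    `chunks` is a list of non-overlapping (offset, bytes) pairs.
    Hypothetical helper for illustration only.
    """
    runs = []
    for offset, data in sorted(chunks):
        if runs and runs[-1][0] + len(runs[-1][1]) == offset:
            runs[-1][1] += data          # extend the current run
        else:
            runs.append([offset, bytearray(data)])
    for offset, buf in runs:
        os.pwrite(fd, bytes(buf), offset)
    return len(runs)  # number of write syscalls actually issued

# Eight consecutive 16 KiB chunks arriving out of order collapse into
# a single 128 KiB write, giving the filesystem one big allocation
# request instead of eight small ones.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
fd = os.open(tmp.name, os.O_RDWR)
chunks = [(i * CHUNK, bytes([i]) * CHUNK) for i in (3, 0, 1, 2, 5, 4, 7, 6)]
writes_issued = coalesced_write(fd, chunks)
file_size = os.fstat(fd).st_size
os.close(fd)
os.unlink(tmp.name)
```

Fewer, larger writes give the filesystem's delayed allocator a chance to place whole runs contiguously, which is why coalescing reduces the fragment counts quoted above.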

A side effect of this is that misaligned multi-file torrents also cause slow speeds - index.php/topic,2627.msg12725.html#msg12725

Using a very large cache to "paper over" some of these issues doesn't help and can cause other problems:
index.php/topic,2042.0.html