Backup/Causes of data loss

Causes of data loss include device failure, software bugs, software lock-in, and human error.

Hardware failure

The magnetic head that floats close to a hard disk's metal platters could crash onto them, the lubrication of the spindle motor could wear down, a manufacturing defect could lead to early failure, or a storage controller could malfunction due to a voltage spike caused by a malfunctioning power supply or a lightning strike.

Since RAID storage systems are more complicated, they have more points of vulnerability, such as a malfunctioning controller that could damage data even if the individual drives are working well, or, if there are no redundancy drives, a single drive failure leaving gaps throughout all data stored on the system.

This happens because RAID systems typically stripe data across drives to multiply sequential transfer rates, since striping allows the individual drives to work in parallel. This performance benefit comes at the cost of vulnerability to data loss and therefore requires redundancy drives (or "parity drives"), so that the user can swap out a defective drive and the data can be automatically rebuilt on a functional drive. If affordable, RAID network-attached storage provides convenient quick access to a large pool of data, but a separate cold storage copy that is rarely touched should be kept if the goal is long-term archival.[1][2]

The complexity of RAID storage systems also makes them more susceptible to maintenance errors.[3]
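
How a parity drive makes a rebuild possible can be sketched in a few lines. This is a simplified illustration that assumes single XOR parity (as used in RAID 5 style layouts); the text above does not name a specific RAID level, and the drive contents and the helper name xor_blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each holding one 4-byte block of a stripe.
data_drives = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# The parity drive stores the XOR of all data blocks in the stripe.
parity = xor_blocks(data_drives)

# Simulate losing the second drive, then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
```

With one parity block per stripe, only a single missing block can be rebuilt this way; a second simultaneous drive failure would still lose data.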

Software bugs

Users have reported parts of their mobile phones' internal storage becoming inaccessible,[4][5] folders disappearing during a move,[6] bogus software updates rendering devices inoperable,[7] faulty colourspace conversion causing the operating system to crash in perpetuity,[8] and all contents of their internal storage being deleted.[9]

Data stored on the memory card is less likely to get lost, since a memory card can be ejected and read externally, as well as being recoverable to some extent using forensic software.

Should a failure such as a system crash or exhausted disk space occur while a file is being written to, the file might be truncated (partially cut off) or blanked. If the file is still open in the program (for example, a text editor), it can be saved to a different device or partition to prevent loss.

Another way to prevent such loss is to create revisions in separate files by occasionally changing the file name while saving, for example by appending an incremental number or a time stamp.
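
For programs that save files from a script, the time stamp approach could look like the following minimal sketch. The function name revision_path and the time stamp format are assumptions chosen for this example.

```python
from datetime import datetime
from pathlib import Path


def revision_path(path, now=None):
    """Return a sibling path with a time stamp inserted before the suffix."""
    path = Path(path)
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    # "draft.txt" becomes "draft_<stamp>.txt" in the same directory.
    return path.with_name(f"{path.stem}_{stamp}{path.suffix}")
```

Saving each revision under the name returned by revision_path("draft.txt") keeps earlier versions intact, so a truncated write only affects the newest file.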

Failed writes

Data can get lost from unsuccessful writes to storage media. This can be caused by a full operating system partition, by unplugging external storage such as USB sticks without ejecting them safely, or by a loose connector that could unexpectedly disconnect an external storage device from the computer.

To improve performance, many operating systems use a write buffer. This means that newly written files appear complete to programs, so they can be accessed faster, even if they have not yet been completely written to the storage device.

On Windows, a safe ejection can be performed from a tray icon at the right end of the task bar. On Linux, a sync command forces finishing writes, and the umount and eject commands followed by the device name or mountpoint can be used to prepare the computer for a disconnection of the device. The device name and mountpoint can be obtained using the lsblk (list block devices) command.
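
Programs can also ask the operating system to flush its write buffer for a single file. The following is a minimal Python sketch; the function name write_durably is made up for this example, and the device should still be ejected or unmounted before it is unplugged.

```python
import os


def write_durably(path, data):
    """Write bytes and ask the OS to flush them to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # empty Python's own buffer into the OS
        os.fsync(f.fileno())  # ask the OS to push its write buffer to the device
```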

Some software might not be able to handle a no-space-left condition and corrupt or blank a file it writes to. In addition to loss of work, this can lead to reset preferences since the configuration file might get corrupted or blanked.
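
Software can guard against this by writing the new content to a temporary file first and only then swapping it into place. The sketch below shows one common way to do so in Python; atomic_write is a hypothetical helper name, and the approach assumes that the temporary file is created on the same file system as the target.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` only after the new content is fully written."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # swaps the file in a single step
    except BaseException:
        # If the disk is full or the write fails, the original stays untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the disk runs full halfway through, only the temporary file is affected, and the previous configuration file remains readable.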

On rare occasions, an operating system might be rendered unbootable if its partition is completely full.[10] Then, the only solution is to boot from another operating system installed on portable storage media such as an external SSD, then manually move files off the operating system partition to a different partition or device that has free space.

Malicious software

Malicious software, also known as malware, might damage your data while executing its payload. While early malware usually caused damage to entertain its creators, more modern malware known as ransomware encrypts files and then demands a payment for decrypting the files.

If your data has been encrypted by ransomware and you have no backup, the chance of getting it back varies depending on the program. For more recent versions of ransomware, there may be no chance of getting it back without paying the ransom. Even then, you are not guaranteed to regain access to your files given that an anonymous cybercriminal gets to decide whether your files will be decrypted.[11][12]

Consider storing very important data on write-once media such as recordable optical discs (CD-R, DVD±R, BD-R), given that data on those media is inviolable by malware.

File system modifications

Partition management

Partition management is a highly risky task.

Before partitions on a data storage device are modified, a full-disk image backup should be taken, since any interruption in the process could destroy the file systems, making the files inaccessible. A full-disk image backup allows for recovery by writing the partitions back to the device in the state they were in before the partitioning.[13][14]
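
The idea of an image backup can be illustrated with a short Python sketch that copies a source to an image file in chunks and records a checksum for later verification. The function names are made up for this example. Reading a raw device (for example a path such as /dev/sdX on Linux, used here only as a placeholder) usually requires administrator privileges and unmounted partitions, and dedicated imaging tools are normally the better choice for real devices.

```python
import hashlib


def copy_with_checksum(source, image, chunk_size=1024 * 1024):
    """Copy `source` to `image` in chunks; return the SHA-256 of the data read."""
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(image, "wb") as dst:
        while chunk := src.read(chunk_size):
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the value returned by copy_with_checksum with sha256_of(image) afterwards confirms that the image on disk matches what was read from the source.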

File system repair tools

Tools such as Check Disk (CHKDSK) on Windows and the file system consistency check (fsck) on Linux are intended to repair problems with a file system that could arise after writing to disk was interrupted, which could, for example, lead to inconsistencies between the files and the space marked as used. On Windows, CHKDSK also renames and relocates files with names containing characters Windows considers invalid, such as colons and question marks, that might have been created on a Linux-based operating system.[15]
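
File names can be checked for such characters before a drive is moved from Linux to Windows. The following Python sketch assumes the commonly documented set of characters that Windows rejects in file names; the helper name invalid_characters is made up for this example.

```python
# Characters that Windows does not allow in file names.
WINDOWS_INVALID = set('<>:"/\\|?*')


def invalid_characters(name):
    """Return the Windows-invalid characters found in a file name, in order."""
    return [ch for ch in name if ch in WINDOWS_INVALID]
```

Renaming affected files on Linux beforehand avoids relying on a repair tool to rename them later.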

It is recommended to renew the backup before running such tools, since they are not guaranteed to repair a file system successfully and might cause collateral damage. File system repair can be thought of as a digital surgery. Similarly to partition management, an interruption in the process can lead to enormous damage.

However, on journaled file systems such as NTFS (default on Windows and widely pre-formatted on external drives), ext4 (default on Linux), and UDF (optical discs), file system checking is usually not needed after a power interruption, and when it is, it is more reliable than on old-fashioned file systems such as FAT32 that lack a journal.

Human error

Human errors that can lead to data loss include the deletion of files and folders that one thinks are already backed up but are not, mistyping a command, choosing a wrong device for writing a disk image to, and dropping a device from a height or into liquid.[16][17][18]

A damaged touch screen of a smartphone after a drop might be insensitive to touch, making it impossible to access files. Even when the smartphone is connected to a desktop/laptop computer through a USB cable or wirelessly, the file transfer typically needs to be activated on the phone first, which is impossible if the touch screen is unresponsive.[19]

Before working with sensitive tools such as dd and ddrescue, it is recommended to renew the backup and to unplug any momentarily unneeded external storage devices to protect them from overwriting accidents.

Avoid working with sensitive tools when tired. Fatigue and sleepiness increase the chance of accidents occurring.

On defective devices, data stored on a memory card is likely retrievable externally; storing data on a removable memory card therefore protects against data loss. The most popular memory card format in portable devices is MicroSD.

Human error is also a risk to cloud storage services, albeit a smaller one than the risk of data loss on local data storage managed by an individual with little experience in using computers.[20] For example, Amazon claims their "Glacier" service is "99.999999999% durable". That is eleven nines in total, meaning the chance of data loss is claimed to be one in a hundred billion. Even if this claim were true from a technical point of view, which is very unlikely, it still does not take into account that an angry employee could deliberately delete data. Though unlikely, this can be safely assumed to be likelier than one in a hundred billion.[21]
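
The conversion from a durability percentage to "one in N" odds can be checked with exact arithmetic. The sketch below assumes the commonly quoted figure of 99.999999999% (eleven nines in total); loss_odds is a hypothetical helper name.

```python
from fractions import Fraction


def loss_odds(durability_percent):
    """Return N such that the implied chance of loss is one in N."""
    # Parse the percentage exactly from its decimal string.
    loss_fraction = 1 - Fraction(durability_percent) / 100
    return 1 / loss_fraction


odds = loss_odds("99.999999999")  # equals 100000000000, one in a hundred billion
```

Passing the percentage as a string avoids floating-point rounding, which matters when the figure is this close to 100.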

Vulnerable storage location

Some people make the mistake of treating the internal storage of their computer or mobile phone as a permanent archive.[22][23][24][25]

There is nothing wrong with having redundant copies of files on such devices, and it is useful if the files are accessed frequently. However, without an independent cold data backup, files stored on a device's internal storage are at risk of loss from any failure that would make the device inoperable, ranging from malware and bogus updates to component failure.

Some vendors of computers and smartphones let the user remotely erase their device if it is linked to an account, which is intended for use after physical theft. However, this leads to a risk of data being erased through a compromised account.[23]

Bad user interface design

Bad user interface design, such as the ability to delete photos through a side-swiping gesture on some mobile photo viewers, or the ability to delete history items with a press of the "delete" key without confirmation in the Firefox web browser, could cause the inadvertent deletion of data.

Software lock-in

People might habitually store data inside a locked-in location such as a mobile phone application's internal data, thinking it will always be there. In reality, access to that data could be denied with little warning through a software update that removes a feature, or a software bug.[26]

If data is stored in a locked-in location from where it cannot easily be ported and is only accessible through specific software, the data will at most only be accessible for the lifespan of the one device it is stored on.

If an app depends on an online service and is poorly designed so that it refuses to start when no connection to the service provider can be established, the unavailability or discontinuation of that service could make user data inaccessible.

Instances of software lock-in include:

  • Some mobile browsers like Samsung Internet for Android (earlier known as "S Browser") lock saved pages away in the /data/ directory, from where they cannot be backed up or copied to other devices, except with root access, which is locked behind the bootloader by default, and unlocking that typically involves a full device erasure.[27]
  • In April 2019, software support for the "software as a service" Telekom Entertain 303 Media Receiver, a digital television apparatus, was deprecated. Since the operating system is software as a service, it stopped working as well, making recordings stored on the hard disk inaccessible. The USB port of the device could not be used for recording or file transfer, and the operating system could also not read USB sticks with existing media. The port has no use beyond serving as a 0.5 A charger. Data on the device's hard disk was stored in a proprietary format.

The device was introduced in 2011, so people could lose up to eight years of recordings. But since data could not be moved anywhere else from the built-in 500 GB HDD of the device, people were forced to delete recordings anyway to be able to record new footage.[28]

Overreliance on online services

While storing data both in an online service and locally reduces the risk of data loss compared to storing it only locally, people who have stored data in online services have repeatedly made the mistake of not retaining a local copy, which led to them losing access to their data.[20]

For example, people uploaded home videos to YouTube without locally retaining the original files. Due to the possibility of videos being removed and accounts being terminated in error, and the difficulty of reaching a YouTube employee for review, a local copy of such videos should be kept.[29][30][31][32]

Another example is that Google frequently locks user accounts and demands that users disclose their phone number to regain access. In Germany, it has been mandatory since July 2017 to identify oneself to obtain a mobile phone number.[33] Other countries have introduced similar laws over time.[34] So Google holds accounts hostage until one provides personally identifiable information.[35][36][37] Internet companies have disrespected users' privacy by using mobile phone numbers for commercial purposes in the past.[38]

An online platform might change its business model, resulting in a purge of prior data. For example, in 2022, Vimeo planned to suspend accounts whose videos consumed excessive bandwidth.[39]

One of the largest data loss disasters on Internet social networking services occurred on MySpace. The majority of audio tracks uploaded to the site were lost, reportedly due to a maintenance error. The music tracks were listed but unplayable, with the earliest reports of tracks being unplayable dating back to December 2017.[40] Initially, the site's developers reported trying to solve the problem, presumably to delay an uproar, but later came to acknowledge that the data was permanently lost.[41] The incident was dubbed a "datapocalypse", a portmanteau of "data" and "apocalypse". Even after the web site abruptly removed the user blogging feature along with existing user blogs without announcement in 2013, nearly five years earlier, some people were still not compelled enough to back up their data locally, resulting in them losing it.[42]

Cloud storage providers usually close long unused accounts to clear disk space for new users.[43] Cloud storage is for short-term convenience, not archival. Exceptions are dedicated paid services with this stated purpose. However, due to lack of control, even those do not replace a local copy.[2]

Synchronization of bad data

A safe backup is independent from its source device. If a backup is automatically synchronized and there is no error detection mechanism, there is a risk of damaged data being synchronized.[2]
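
One possible error detection mechanism is a checksum manifest: the checksums of all files are recorded while the data is known to be good, and the files are checked against that record before a synchronization runs. The following Python sketch assumes SHA-256 checksums and uses made-up function names; it reads each file fully into memory, so it is meant for illustration rather than for very large files.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 digest of a file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(directory):
    """Map each file's relative path to its SHA-256 digest."""
    root = Path(directory)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(directory, manifest):
    """List files that are missing or no longer match the manifest."""
    root = Path(directory)
    return [
        name
        for name, digest in manifest.items()
        if not (root / name).is_file() or file_digest(root / name) != digest
    ]
```

A synchronization job could be skipped whenever find_mismatches reports files that were not intentionally changed, so that damaged copies do not overwrite good ones.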

Loss of access credentials

Data in encrypted archive files or file systems or on a cloud service is at risk of loss if the access credential, commonly a password, is lost. This is unlikely to happen over a short time span, but becomes significantly likelier if the credential goes unused for a long time.

As mentioned above, cloud storage providers purge long inactive accounts, meaning by the time one forgets the credentials to a long unused cloud account, it might not exist anymore.

Loss and theft

Portable devices such as mobile phones that store data can be lost or stolen.

This risk is higher during trips, where one might, for example, leave a bag with personal belongings unattended in a crowded area.

The loss of data from such an occurrence can be minimized by backing data up to a storage device left at the base (such as a hotel or rented apartment), and by synchronizing it to a cloud storage service if the mobile data plan allows for it.

Although less likely, data could be lost to home burglary as well, especially when stored on the internal storage of expensive devices that are attractive to burglars due to their resale value. Data can only be protected from burglary by storing a copy in a separate location.[22]

Natural disasters

In some parts of the world, people may have to take natural disasters into account. For example, a flood could soak a basement, destroying all data storage except for optical discs. This can be prevented using water-tight bags or containers. A basement, however, is the least vulnerable part of a house in an earthquake because it is supported by the surrounding earth. An earthquake mainly affects the top of a house, which could be completely destroyed if the house collapses.

References

  1. All of our data is GONE! – Linus Tech Tips – January 4th, 2016
  2. "Today I Lost 12 TB of Backblaze "Protected" family Photos". Reddit (/r/BackBlaze). 2020-06-07. Archived from the original on 2022-11-04. Retrieved 2023-12-03.
  3. "Scrub Your NAS Hard Drives Regularly if You Care About Your Data". Louwrentius.com. 2020-04-22. Retrieved 2023-01-21.
  4. Omnia 2 doesn't recognize internal storage! – XDA forum user WhiteRussianBC – September 2nd, 2010
  5. "My Memory" (8GB) on Ominia II (GT-I8000L) has disappeared! – XDA forum user mattbiondi – May 10th, 2010
  6. Photos disappeared after moving to another device folder – Gökhun Güneyhan – Google Photos community (April 24th, 2018; 57 replies)
  7. Apple iOS update BRICKS repaired iPhones after screen repair - Louis Rossmann (2018-04-10)
  8. How THIS wallpaper kills your phone (by Arun Maini, Mrwhosetheboss, 2020-06-04)
  9. I just deleted a random folder in my internal storage and it wiped my internal storage. What the heck just happened? – Reddit – /r/Android (January 11th, 2013)
  10. Das Apple Drama!! [VLOG 4] - Kelly MissesVlog (2015-01-01) - YouTube (8 minutes)
  11. "My company was hacked and years worth of files have been encrypted". Tech Support - Reddit. 2021-02-20. Retrieved 2024-07-20. After spending a month hiring companies to try and get it back they payed the ransom of a staggering 1500 dollars and got it all back. - the company had not renewed its backup for seven years, since 2014.
  12. The No More Ransom Project
  13. islaambaduk_ (2022-11-14). "3 years of data wiped clean*accidentally*". Reddit. Retrieved 2023-01-15. Was partitioning the drive in cmd and the shit happened
  14. éclairJess (2018-08-30). WICHTIG !!EINSELF! (wird später gelöscht). Internet Archive (in German). Retrieved 2024-03-20. (originally uploaded to YouTube) - Artist explains she accidentally lost a partition containing her works.
  15. Wikipedia: CHKDSK
  16. Alles weg - Hochformat (Simon Unge), January 29th, 2020 – smartphone (iPhone XS plus or iPhone 12 plus) stuck on boot screen after falling from a height of several metres onto a carpet. The carpet did cushion the impact, but not enough to prevent damage to the device.
  17. Prank geht schief – BibisBeautyPalace (Bianca Heinicke), December 15th, 2016 (at 2 minutes and 54 seconds) – smartphone (iPhone 6s) dropped into pool. Afterwards, it powered on but the touch screen was unresponsive to input.
  18. Smartphone fell into lake, resulting in the loss of all data: GO2mobile (2013-05-23). Sony Xperia Tablet Z - Eintauchen und Eintauschen (in German). Retrieved 2023-06-10. (at 3 minutes and 5 seconds)
  19. Get data from cracked Nexus 4 - Panomosh - January 5th, 2015 - XDA developers
  20. Bruce Schneier (2014-09-25). "Security Trade-offs of Cloud Backup". Retrieved 2023-12-03.
  21. Henry Newman (2012-09-04). "Cloud Storage Marketing Hype: Amazon Glacier's Big Claims". Enterprise Storage Forum. Retrieved 2023-12-03.
  22. davdreamer (2022-07-09). "Stolen laptop in Westmeath Ireland, Please Help!". Imgur. Retrieved 2023-01-10. the laptop has five years of photos as well as a book I had just finished writing!, "if I could hop in my time machine and back everything up I would."
  23. Honan, Mat (2012-08-17). "Mat Honan: How I Resurrected My Digital Life After an Epic Hacking". Wired. ISSN 1059-1028. Retrieved 2023-02-06. "Had I been regularly backing up the data on my MacBook, I wouldn't have had to worry about losing more than a year’s worth of photos, covering the entire lifespan of my daughter, or documents and e-mails that I had stored in no other location." "And worst of all, my AppleID account was broken into, and my hackers used it to remotely erase all of the data on my iPhone, iPad, and MacBook."
  24. "Methodology to protect your data. Backups vs. Archives. Long-term data protection - Apple Community". discussions.apple.com. 2014-06-14. Retrieved 2023-02-06. "No computer, regardless of HD or SSD size is a data storage device, and should never be considered as such." "All collections of media files such as pictures, music, and videos, unless directly needed should be kept off the notebook and on an external hard drive or likewise."
  25. "Apple Stole My Music. No, Seriously". 2016-05-04. Archived from the original on 2016-05-05. I painstakingly imported from thousands of CDs and saved to my computer's internal hard drive.
  26. everyoneelsethatlive (2022-01-29). "Whatsapp "deleted" all my videos of a passed loved one. Panicking". Reddit. Retrieved 2023-01-14. A person close to me recently passed away and a lot of messages, voice messages, pictures and videos in our conversation were saved on the app and meant a lot to me. Today I noticed I couldn't play videos from our conversation (both the ones I sent and the ones he sent).
  27. The sad state of personal data and infrastructure - Karl Icoss, BeepB00p.xyz
  28. Schamberg, Jörg (17 January 2019). "Telekom zieht 2019 beim alten "Entertain" den Stecker". onlinekosten.de (in German). Archived from the original on 12 April 2021. Retrieved 3 October 2022.
  29. My youtube account was terminated for "nudity or sexual content". Worried about losing thousands of hours of private/unlisted videos for our business. – /r/PartneredYouTube – Reddit (May 12th, 2022)
  30. My channel got terminated without reason, (No strikes, no warnings) I worked 4 years on this . (September 23, 2021) ("I have old personal videos that I have not saved anywhere else in there saved as private and do not want to lose them.")
  31. Tweet from February 25th, 2022: "@TeamYoutube can you unban my old youtube account, it has alot of old video memories that i had not saved in my computer yet and all i would really want is to get them back[…]"
  32. 3 Things I Wish I Knew when I First Started on YouTube (at 01m02s) by Tim Schmoyer, Video Creators TV – honeymoon videos taken down due to copyrighted music in background.
  33. Pieruschka, Von Marius (5 August 2016). "Prepaid-Karten: Ausweispflicht ab 01. Juli 2017". 4G.de – Das offizielle Infoportal zum Thema 4G, LTE, HSPA+ etc. (in German). Retrieved 3 October 2022.
  34. "Timeline of SIM Card Registration Laws". Privacy International. 2019-06-11. Retrieved 2023-01-10.
  35. Tell HackerNews: Google requiring phone number to log into Chromebook – YCombinator – September 9th, 2018
  36. Google is forcing me to enter a phone number to login to my account – Reddit – /r/Privacy – January 7th, 2021
  37. Google, Stop Asking Me for My Phone Number – Austin – GroovyPost – May 14th, 2012 ("Despite skipping through the process multiple times, Google has continued to harass me with phone number requests. How many times will I have to login and skip it before Google gets the idea? Where is the opt-out button?")
  38. Ravie Lakshmanan (2022-05-26). "Twitter Fined $150 Million for Misusing Users' Data for Advertising Without Consent". Retrieved 2023-01-29.
  39. Sato, Mia (15 March 2022). "Vimeo is telling creators to suddenly pay thousands of dollars — or leave the platform". The Verge. Retrieved 18 June 2023.
  40. MySpace music profiles – Reddit post from December 9th, 2017: "I have music on a profile from when I was making music […] That is the only copy In Existence and would love to have it again for the memories"
  41. Brodkin, Jon (18 March 2019). "Myspace apparently lost 12 years' worth of music, and almost no one noticed". Ars Technica. Retrieved 24 September 2022.
  42. "MySpace Punishes Its Few Remaining Friends By Vanishing Their Blogs". TechCrunch. 12 June 2013. Retrieved 24 September 2022. (Livefyre comment archive, 2014)
  43. Novet, Jordan (23 February 2018). "Dropbox shows how it manages costs by deleting inactive accounts". CNBC. Retrieved 3 October 2022.