Suramya's Blog : Welcome to my crazy life…

June 8, 2021

Great book on Military Cryptanalytics by Lambros Callimahos released to the public

Filed under: Computer Security,Computer Software,My Thoughts,Techie Stuff — Suramya @ 9:58 PM

I find cryptography and code breaking very interesting because of their huge implications for cyber security. The modern world rests on the presumption that cryptographic algorithms are secure; that is what lets us use the internet, bank online, find love online and even work online. Historically, cryptography has been a field working under heavy classification, and there are many people we don’t know about because their existence and work were classified.

Lambros Callimahos was one such cryptologist. He was good enough that two of his books on Military Cryptanalytics covering code breaking (published in 1977) were blocked from public release till 1992. The third and last volume in the series was blocked from release till December 2020. It is now finally available for download as a PDF file so you can check it out.

The book covers how code breaking can be used to solve “impossible puzzles”, and one of the key parts of the book is its explanation of how to use cryptodiagnosis to decrypt data that has been encrypted using an unknown algorithm. It has a whole bunch of examples and walks you through the process, which is quite fascinating. I am going to try getting through it over the next few weeks if I can.

Check it out if you would like to learn more about cryptography.

– Suramya

May 30, 2021

You can now run GUI Linux Apps on Windows 10 natively

Filed under: Computer Software,Linux/Unix Related — Suramya @ 10:17 PM

With the latest update of Windows Subsystem for Linux (WSL), you can now run Linux GUI applications on Windows natively. This is pretty impressive considering Steve Ballmer famously branded Linux “a cancer that attaches itself in an intellectual property sense to everything it touches” back in 2001. In just 20 years, Microsoft has changed its stance and started adding more Linux functionality to its operating system.

Arguably, one of the biggest, and surely the most exciting update to the Windows 10 WSL, Microsoft has been working on WSLg for quite a while and in fact first demoed it at last year’s conference, before releasing the preview in April… Microsoft recommends running WSLg after enabling support for virtual GPU (vGPU) for WSL, in order to take advantage of 3D acceleration within the Linux apps…. WSLg also supports audio and microphone devices, which means the graphical Linux apps will also be able to record and play audio.

Keeping in line with its developer slant, Microsoft also announced that since WSLg can now help Linux apps leverage the graphics hardware on the Windows machine, the subsystem can be used to efficiently run Linux AI and ML workloads… If WSLg developers are to be believed, the update is expected to be generally available alongside the upcoming release of Windows.

The feature is still only available in Windows 10 Preview Builds but is expected to be released for general use in the near future.

I would love to see the reverse being developed: the ability to install and run Windows applications on Linux natively and officially. There is Wine/Crossover but they don’t support 100% of the applications yet. It would be cool if Microsoft contributed to either of the tools to allow people to run Windows software on Linux.

I personally use Crossover to run the Office suite and it works great for me (for the most part). The latest version supports Office 365 and most of it works fine, except for Excel, which still has a bit of a problem with large files but works otherwise. That is why I also have Office 2007 installed, where Excel works without issues even with large files.

Compatibility with the MS Office suite is why a lot of users don’t want to switch from Windows to Linux or Mac. OpenOffice/LibreOffice is great but the UI sucks and the files are not 100% compatible (at least the last time I tried it, they weren’t), so the files might not look the same as you expected when you share them with Office users.

Source: Microsoft doubles down on Windows Subsystem for Linux

– Suramya

May 20, 2021

Thoughts on NVIDIA crippling cryptocurrency mining on some of its cards

Filed under: Computer Security,Computer Software,My Thoughts,Techie Stuff — Suramya @ 8:11 PM

You might have heard the news that NVIDIA has added code to its GPUs that makes them less attractive for cryptocurrency mining by reducing the efficiency of such computations using a software patch. On one side this is great news because it means that GPUs will be less attractive for mining and be available for gamers and others to use in their setup. However, I feel that this is a bad precedent being set by a company. In effect they are deciding to control what you do with the card after you have bought it. A similar case would be a restriction in your car purchase to stop you from using it on non-highway roads. Or to stop you from carrying potatoes in the trunk.

This all comes back to the old story about DRM and how it is being used to restrict us from actually owning a device. With DRM you are essentially renting the device and if you do anything that the owner corporation doesn’t agree with then you are in for a fun time at the local jail. DRM/DMCA is already being used to block farmers from fixing their farm equipment, medical professionals from fixing their health equipment and a whole lot more.

Cory Doctorow has a fantastic writeup on how DRM works and the problems caused by it. DRM does not support innovation, it actually forces status-quo because it is illegal to bypass it.

I have an old Xbox sitting in my closet collecting dust. I want to run Linux on it but that requires me to break the law, because I would need to bypass the DRM protections in order to install a new OS. Today we are OK with them blocking cryptocurrency mining, but what if tomorrow the company gets into a fight with a gaming company and decides to degrade the game’s performance because the gaming company didn’t pay the fees for full performance? What if tomorrow they decide to charge a subscription fee to get the full performance from the device? What is to stop them from degrading or crippling any other activity they don’t agree with whenever they feel like it? The law is in their favor because of DRM: laws like the DMCA (and other such laws) make it illegal to bypass the protections they have placed around the device.

This is a slippery slope and we can’t trust the corporations to have our best interest at heart when there is money to be made.

There is more discussion on this happening over at HackerNews. Check it out.

– Suramya

May 17, 2021

IBM’s Project CodeNet: Teaching AI to code

Filed under: Computer Software,Emerging Tech,My Thoughts — Suramya @ 11:58 PM

IBM recently launched a new program called Project CodeNet, an open source dataset that will be used to train AI to better understand code. The idea is to automate more of the engineering process by applying Artificial Intelligence to the problem. This is not the first project to do this and it won’t be the last. For some reason AI has become the cure-all for all ‘ills’ in any part of life. It doesn’t matter if it is required or not; if there is a problem, someone out there is trying to apply AI and Machine Learning to it.

This is not to say that Artificial Intelligence is not something that needs to be explored and developed. It has its uses but it doesn’t need to be applied everywhere. In one of my previous companies we interacted with a lot of companies who would pitch their products to us. In our last outing to a conference over 90% of the ideas pitched had AI and/or Machine Learning involved. It got to the point where we started telling the companies that we knew what AI/ML was and asked them to just explain how they were using it in their product.

Coming back to Project CodeNet, it consists of over 14M code samples and over 500M lines of code in 55 different programming languages. The data set is high quality and curated. It contains samples from open programming competitions, and along with the code it includes the problem statements, sample input and output files, and details like code size, memory footprint and CPU run time. Having this curated dataset will allow developers to benchmark their software against a standard dataset and improve it over a period of time.

Potential use cases to come from the project include code search and clone detection, automatic code correction, regression studies and prediction.

Press release: Kickstarting AI for Code: Introducing IBM’s Project CodeNet

– Suramya

May 14, 2021

NTFS has a massive performance hit on Linux compared to ext4

Filed under: Computer Software,Linux/Unix Related,My Thoughts,Techie Stuff — Suramya @ 12:47 PM

NTFS has long been a nemesis of Linux. I remember in the 2000s getting NTFS working on Linux required so much effort and so many config changes that I stopped using it on my systems, as FAT32 was more than sufficient for my needs at that time. Initially the driver was very unstable and it was recommended that you only use it for read operations rather than read/write, as there was a high probability of data corruption. That has changed over the years and the driver is now stable. However, there is a massive performance hit when using NTFS vs ext4 on a Linux machine, and I saw this when I tried using an NTFS partition on my laptop instead of ext4.

I have a 1 TB drive on my laptop along with a SSD. I dual boot the laptop (need it for my classes) between Windows & Debian and wanted to have all my files available on both OS’s. When I last tried this, ext support on Windows was not that great (and I didn’t feel like searching for options) so I decided to format the drive to NTFS so that I would have access to the files on both OS. The formatting took ages and once the drive was ready I was able to copy my files from the desktop to the laptop. While the files were being copied I noticed very high CPU usage on the laptop and the UI was lagging randomly. Since I was busy with other stuff I let it be and ignored it.

Yesterday I was trying to move files around on the laptop so that the root partition had enough space to do an upgrade and I again noticed that file copy and most of the disk operations were taking way longer than I expected. For example there would be a second of delay when I tried listing the directory when it had a lot of files. So, I decided to test it out. My data on the Laptop is an exact copy of the files on the Desktop. I timed the commands on the desktop with the same command on the laptop and there was a significant difference.

My desktop is obviously a lot more powerful than the laptop, so I decided to try an experiment where I would run a command on the NTFS drive, then format the drive to ext4 and run the same command (after copying all the files back). When I did this I saw that there was a massive difference in the time it took to run the command. On ext4 the command took less than 1 second (0.107s) whereas it took almost 34 seconds (33.997s) on the NTFS partition. The screenshots for both commands are below:


du -hs command on an ext4 partition


du -hs command on an NTFS partition

That’s a ridiculous amount of difference between the two. So I obviously had to switch back to ext4, which brought me full circle: I still needed to be able to access my files from Windows as well as from Linux. I decided to do a search on the Internet for options and found out that Windows 10 now lets you mount Linux ext4 filesystems in WSL 2. I haven’t tried it yet but I will test it over the next few days once I am done with some of my assignments. If there is something interesting I will blog about it in the near future.

As of now, I am back to using ext4 on the laptop and the OS performance is a lot better.
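For anyone who wants to repeat the comparison, the timing can be scripted. The following is only a minimal sketch: the helper name time_du is made up for illustration, and it assumes GNU date (for the %N nanosecond format).

```shell
# Minimal sketch: time `du -hs` on a directory and print the elapsed seconds.
# time_du is a hypothetical helper name; GNU date is assumed for %N.

time_du() {
    local dir="$1" start end
    start=$(date +%s.%N)          # wall-clock time before the scan
    du -hs "$dir" > /dev/null     # the command being measured
    end=$(date +%s.%N)            # wall-clock time after the scan
    # Print the difference with millisecond precision.
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}
```

Run it once against the data directory on the NTFS partition and once after reformatting to ext4, then compare the two numbers. The kernel caches directory data, so a second run over the same directory will be much faster than the first; compare first runs only.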

Well this is all for now. Will post more later.

– Suramya

April 30, 2021

Review and test of Fawkes: Software to protect your pictures from AI/Reverse searches.

Filed under: Computer Software,My Thoughts — Suramya @ 11:28 PM

Yesterday, I wrote about Fawkes & Photo Ninja, which can be used to protect your photos from facial recognition models and reverse image searches. This is a very interesting field and I had mentioned creating a service that does it for free instead of charging like Photo Ninja does.

The first step to that is to check if the program (Fawkes) actually works the way it is supposed to, so I downloaded a pic from the internet (my profile pic on Twitter) and ran it through Fawkes. The program takes a while to run (~20 seconds per image), depending on the number of people in the photo. It detected the faces very reliably and modified the image. With the default settings the output is saved as a PNG file, but you can override that using a command line parameter. It requires you to provide the directory you want to run it against, but if you don’t pass it the directory, it doesn’t give any errors. It took me a few minutes to figure out what the issue was (yes, I know… My brain is tired). The command to run it in the current directory with debug (because I like seeing what the software is doing) is:

./protection --debug --directory .

I then took the resultant file and searched for it via Google Images, Yandex and TinEye. None of them were able to find any results with the new image. So that part of the software works great. 🙂 Now coming to how the software modifies the image, I saw that it adds 2 rows of pixelisation to the image. The first is near the hairline and cuts across the hair and forehead, and the second is near the chin and is about 5-10 pixels wide. It is clearly visible in larger photos, but when zoomed out it doesn’t look too jarring. Frankly it looks like the image got damaged and is kind of obvious when you look at it.

In my very basic tests it made the same change every time, so I have a feeling that we can train image recognition software to look for this modification and ignore it. It might be more powerful to put the modifications at random locations in the image (over the faces); that way it is harder to train the software to counter it. Plus, if the visual noise can be reduced it would be great. Maybe instead of a long blur that is noticeable we can try to do multiple small changes that change the pic without making it obvious that the image was modified.

Below are the two images, the original on the left and the modified version on the right.


Sample output of Fawkes

I then looked at running this on my webserver, but due to the restrictions there I wasn’t able to get it to run. Although, to be honest, I only tried for about 20-30 minutes because I was tired. If I can’t get it to run on the server then the other option is to run it on my home computer, but I will need to look at that in more detail before I commit to making this site. I have a rough draft of the requirements and feature list but I am still looking at the options before I start working on it. It will be a good way to take my mind off what is going on in the world, so that is good.

Well this is all for now. Will keep you posted on how this project goes.

– Suramya

April 8, 2021

Moving a Windows install to another drive on the same computer shouldn’t be this hard

Filed under: Computer Software,Linux/Unix Related,My Thoughts,Techie Stuff — Suramya @ 11:27 PM

I recently bought a new SSD for my laptop because even after upgrading everything else (except the CPU) the system was still slow, and looking at the process usage I could see that it was waiting on disk reads/writes for the most part, which was causing the slowness. Once I got the new drive, I had to move the existing OS installs from the old disk to the new one. I have three operating systems (OS) on the disk: Windows, Debian and Kali. I need Windows for my classes (my proctored exams have to be taken on a Windows machine) and the others are for my tinkering and general use computing. The disk layout on the old drive was as follows:

root@Wyrm:~# fdisk -l
Disk /dev/sda: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: ST1000LM024 HN-M
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x0f04ad34

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048   1126399   1124352   549M  7 HPFS/NTFS/exFAT
/dev/sda2         1126400 102402047 101275648  48.3G  7 HPFS/NTFS/exFAT
/dev/sda3       102402048 135956479  33554432    16G 82 Linux swap / Solaris
/dev/sda4       135956480 468862127 332905648 158.7G  5 Extended
/dev/sda5       135958528 175017985  39059458  18.6G 83 Linux
/dev/sda6       175022080 237936641  62914562    30G 83 Linux
/dev/sda7       237940736 468862127 230921392   675G 83 Linux

I partitioned the new disk as a copy of the old drive, except for the data partition, which was smaller as the disk was smaller. I used dd to clone each partition onto the corresponding new partition using the following command (where sdb was the new drive):

dd if=/dev/sda1 of=/dev/sdb1 bs=2k
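The same dd call has to be repeated for every partition, so it can be wrapped in a loop. This is a minimal sketch rather than the exact procedure used here: clone_partitions, run and DRY_RUN are hypothetical names, and conv=fsync is an addition that makes dd flush to the target before it exits.

```shell
# Minimal sketch: clone a list of partitions from one disk to another with dd.
# clone_partitions, run and DRY_RUN are hypothetical names for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# clone_partitions SRC_DISK DST_DISK PARTITION_NUMBER...
clone_partitions() {
    local src="$1" dst="$2" n
    shift 2
    for n in "$@"; do
        # bs=2k matches the command above; conv=fsync flushes the data to
        # the target device before dd exits.
        run dd if="${src}${n}" of="${dst}${n}" bs=2k conv=fsync || return 1
    done
}
```

With DRY_RUN=1 set, `clone_partitions /dev/sda /dev/sdb 1 2 5 6` only prints the four dd commands so they can be checked before anything is overwritten. The extended partition (sda4 in the listing) is just a container for the logical partitions, so it is left out of that example.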

Once I copied the partitions over, all I had to do was refresh the GRUB boot loader config using the following command:

update-grub

After the config was updated, I was able to boot into Linux from both my Debian and Kali partitions on the new drive. However, that didn’t work for Windows. It gave me a screenful of random characters, like what you see when you try to open a binary file in a text editor, and refused to boot. Thankfully I had not deleted the old Windows partition so I was able to try a few more things, but *nothing* worked. Windows would just refuse to boot from the new drive. The only solution I found that could have potentially worked was paid software that supposedly allows you to clone your Windows install onto new disks/computers. Since I didn’t want to spend money on something I should have been able to do for free, I didn’t try it.

In the end, after wasting a lot of time on this, I was tired of trying various things so I just decided to reinstall Windows on the new drive. It wasn’t a major loss because I didn’t have much data on Windows, but I still dislike the fact that I had to do so just to put in a new drive. Imagine the hoops I would have had to jump through if I wanted to move to a new computer. Actually I don’t have to imagine, I did jump through them when I moved my install from my old laptop to this one.

My Linux install on the laptop is an exact clone of my desktop install. I used dd to create an image of my Linux install on the desktop and then wrote the image to the laptop. It worked perfectly fine on the first try. All I had to change was the hostname so that my DHCP server didn’t have a nervous breakdown, but other than that everything worked without a single problem. Even the graphics drivers auto-adjusted on the new machine. Imagine if we could do the same thing for a Windows install.

– Suramya

March 27, 2021

Outrun: Run a local command on a remote server

Filed under: Computer Software,Interesting Sites,Linux/Unix Related — Suramya @ 9:49 AM

A lot of times we have to run a command that requires a lot of processing power and is extremely slow on the local computer. I have faced this issue in the past and at times wished there was a way to push these commands to a remote machine with a more powerful CPU. Thanks to the efforts of Alexander Overvoorde (Overv), Jakub Wilk and Xiretza this is now possible. They have created a tool called Outrun which lets you execute a local command using the processing power of another Linux machine without having to install the command on the remote machine.


Sample Execution of ffmpeg on a remote server

The software does have a few limitations, but on the whole it is very cool:

  • We need to have root access on the remote server (or sudo access) as the system needs to run chroot on the remote server
  • Both client and remote server need to be on the same architecture, so you can’t set up a session from an x86 machine to an ARM machine. This is unfortunate because the first use case I had for this tool was to run software from the Raspberry Pi on my server as and when it needed more processing power.
  • File system performance remains a bottleneck
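The same-architecture limitation is easy to check before handing a command over. The sketch below is an illustration only: outrun_if_same_arch and the SSH override variable are hypothetical names, and it assumes ssh access to the remote machine and the basic `outrun <host> <command>` invocation.

```shell
# Minimal sketch: refuse to call outrun when the local and remote CPU
# architectures differ. outrun_if_same_arch and the SSH override are
# hypothetical names used for illustration.

# outrun_if_same_arch HOST COMMAND [ARGS...]
outrun_if_same_arch() {
    local host="$1" local_arch remote_arch
    shift
    local_arch=$(uname -m)
    # Ask the remote machine for its architecture over ssh.
    remote_arch=$(${SSH:-ssh} "$host" uname -m) || return 2
    if [ "$local_arch" != "$remote_arch" ]; then
        echo "architecture mismatch: local=$local_arch remote=$remote_arch" >&2
        return 1
    fi
    # Same architecture: hand the command over to outrun.
    outrun "$host" "$@"
}
```

For example, `outrun_if_same_arch user@server ffmpeg -i input.mp4 output.mp4` (host and file names are placeholders) runs ffmpeg through outrun only when both machines report the same `uname -m` value.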

Check it out if you need to run commands with more CPU cycles than what is available on the local machine.

Thanks to Hacker News for the initial link.

– Suramya

March 7, 2021

Syncing data between my machines and phones using syncthing

I have talked about how my backup strategy has evolved over the years. I am quite happy with the setup I explained in my previous post except for one minor point: I still had to manually sync the data from my laptop, Jani’s laptop and my phone to my desktop. Once it is there on the desktop the various backup processes make sure that it is backed up and secure.

For my laptop, I used Unison to manually check for changes and then sync them over, which works great, but I had to ensure that the sync happened in the correct direction. For Jani’s laptop I mounted my drive on her computer over ssh using these steps and then ran robocopy to copy the files over. This worked only intermittently. For some reason the system would randomly refuse to overwrite changed files with permission denied errors, even when the permission was set to 777. The only way to fix it was to delete all the files on my computer and then do a fresh sync. This worked, but was not user-friendly and required me to manually kick off a backup, which I did infrequently. My phone on the other hand was backed up manually to my computer using sftp. This was very cumbersome and I really disliked having to do it.

I have in the past looked into various technologies that allow multiple devices to sync data with each other. Unfortunately, all of them required an external connection with a copy of the data being stored in the cloud. Since that was a show-stopper for me, I never got around to setting up my systems to automatically sync with each other. Then a few weeks ago, I came across this great article on how to create A Simple, Delay-Tolerant, Offline-Capable Mesh Network with Syncthing (+ optional NNCP). In the article John talked about Syncthing, which allowed him to create a local, serverless, peer-to-peer, open source alternative to Dropbox that let his machines sync directly with each other without a server. In other words, a perfect fit for what I wanted and needed to do. So I spent a little bit of time researching Syncthing and then decided to take the plunge and set up my laptop and desktop to sync with each other. Before starting the setup I backed up all my data so that in case something went wrong I still had a backup. Thankfully nothing did, but it is always good to have a backup.

Syncthing’s installation is pretty simple for all major operating systems, except for iPhones, which are not supported. In Debian, installation just required the following steps:

  • Run the following commands to add the “stable” channel to your APT sources:

    echo "deb https://apt.syncthing.net/ syncthing stable" | sudo tee /etc/apt/sources.list.d/syncthing.list
    curl -s https://syncthing.net/release-key.txt | sudo apt-key add -

  • Once you have added it, run the following commands to install syncthing:

    sudo apt-get update
    sudo apt-get install syncthing

    Once the software is installed, execute the syncthing binary. On my computer it is installed in /usr/bin/syncthing. Once the software starts, it will start the web interface automatically. There is also a desktop application, but I prefer the web UI. Instructions on how to configure the folders and nodes are available in the Getting Started Guide over on the project website so I am not going to repeat them here. Basically, you need to define the nodes and connect them to each other; if the devices are not added on both sides then the folders will not sync.

    The software has a cool discovery feature, which makes it easy to add devices on a given node. As soon as the devices connect to the same network they detect each other and give you the option of connecting both. After the devices are connected, you configure the folder you want to sync and select the devices you want it synced with. The best part is that as soon as you configure one node, the other nodes will get a message stating that Node 1 is attempting to share a folder with them. Clicking on accept allows you to configure the folder path etc. on the node and that’s it. The system will detect the files which need to get synced over and will copy them quickly. You can configure the sync to be bi-directional or one-way. Most of the folders in my setup are bi-directional; the only exception is Jani’s files, which are a one-way sync because I know that I am not going to modify the files on the server.

    Below is what the setup looks like on my desktop. As you can see I am syncing data from 3 different computers/phones to it and the syncs are really fast. I have copied files over to the folder on one computer and within minutes (depending on the size) they were replicated on the other computers/phone.


    My Syncthing setup

    I have the Android client running on my phone as well, and it instantly syncs any new photos etc. from my phone to the desktop. All I need to do is connect to the same LAN (wired or wireless) and the devices connect and sync automagically. There is an option to do so even over the WAN using a relay server, but since I didn’t want that I disabled it in the setup.

    Now all my data is synced to the desktop machine without me having to worry about anything or manually copying files around. Check it out if you want to sync your devices without using an external server.

    – Suramya

November 28, 2020

My Backup strategy and how it has evolved over the years

I am a firm believer in backing up my data. Some people say that I am paranoid about backing up data and I do not dispute it. All my data is backed up on multiple drives and locations and still I feel that I need additional backups. This is because I read the news and there have been multiple cases where people lost their data because they hadn’t backed it up. Initially I wasn’t that serious about it, but when I was in college and working at the helpdesk, a PhD student came in crying because her entire PhD thesis was on a Zip Drive and it wasn’t working anymore. She didn’t have a backup and was basically screwed. We tried a bunch of stuff to recover the data but didn’t manage to recover anything. That made me realize that I needed a better backup procedure, so I started my journey in creating recoverable backups.

My first backup system was a partition on my drive called backup where I created a copy of all my important data (this was back in 2000/2001). Then I realized that if the drive died I would lose access to the backup partition as well, and I started looking for alternatives. This was around the time when I had bought a CD writer, so all my important data was backed up to CDs and I was confident that I could recover any lost data. Shortly afterwards I moved to DVDs for easier storage. However, I didn’t realize till a lot later that CDs & DVDs start becoming unreadable quite easily. Thankfully I didn’t lose any data, but it was a rude awakening to find that the disks I had expected to keep my data safe were starting to become unreadable within a few years.

I then did a bunch of research online and found that the best medium for storing data long term is still hard drives. I didn’t want to store anything online because I want my data to be in my control, so any online backup system was out of the question. I added multiple drives to my desktop and started syncing the data from the desktop & laptop to the backup drive using rsync. This ensured that the important data was in three locations at any given time: my desktop, my laptop and the backup drive. (Plus a DVD copy that I made of all my data every year.)
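A sync like this comes down to one rsync call per directory. The sketch below is an assumption about how such a job could look, not the actual script: sync_to_backup, run, DRY_RUN and the example paths are hypothetical, and --delete is included on the assumption that the backup was meant to mirror deletions too.

```shell
# Minimal sketch: mirror a directory onto a backup drive with rsync.
# sync_to_backup, run and DRY_RUN are hypothetical names for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# sync_to_backup SRC_DIR DEST_DIR
sync_to_backup() {
    # Strip any trailing slash so it can be added back consistently.
    local src="${1%/}" dest="${2%/}"
    # -a preserves permissions, timestamps and symlinks; --delete removes
    # files from the backup that no longer exist in the source.
    run rsync -a --delete "$src/" "$dest/"
}
```

A call such as `sync_to_backup /home/user/Documents /mnt/backup/Documents` (placeholder paths) can then be scheduled from cron. Because --delete mirrors deletions, a plain mirror like this keeps no history of older versions.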

I continued with this backup strategy for a few years but then realized that I had no way to go back to a previous version of any given document. If I deleted a file or wanted an older version of a file, I only had 24 hours before the changes were synced to the backup drive, after which the old version was unrecoverable. There was a case where I ended up having to dig through my DVD backups to find the original version of a file that I had changed. So I did a bit of research and found rdiff-backup. It allows a user to back up one directory to another and generates incremental backups, so we can recover/restore files as they were on a given date. The best part is that the software is highly efficient: once the initial backup is done, it only transmits the changes to the files in subsequent runs. Now that I have been using it, I can restore a snapshot of my data going back to 2012 quite easily.
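The backup and restore steps described above can be sketched as two small wrappers. This is an illustration under assumptions: backup_snapshot, restore_as_of, run and DRY_RUN are hypothetical names, the paths are placeholders, and the calls use rdiff-backup’s classic command-line form (newer releases also have sub-command style syntax, so check `rdiff-backup --help` on your version).

```shell
# Minimal sketch: incremental backup and point-in-time restore with
# rdiff-backup. All function names and DRY_RUN are hypothetical.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# backup_snapshot SRC_DIR BACKUP_DIR: create or update the incremental backup.
backup_snapshot() {
    run rdiff-backup "$1" "$2"
}

# restore_as_of WHEN BACKUP_PATH TARGET: restore a file or directory as it
# was at WHEN (for example 10D for ten days ago, or a date like 2012-06-01).
restore_as_of() {
    run rdiff-backup -r "$1" "$2" "$3"
}
```

For example, `restore_as_of 10D /mnt/backup/home/notes.txt /tmp/notes.txt` would pull back the version of notes.txt from ten days ago without touching the current copy.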

I was quite happy with this setup for a while, but while reading an article on best backup practices I realized that I was still depending on only 1 location for the backup data (the rdiff-backup snapshots), and the best practices stated that you should also store it on an external drive or at an offsite location to prevent viruses/ransomware from deleting backups. So I bought a 5TB external drive and created an encrypted partition on it to store all my important data. But I was still unhappy because all of this was still stored at my home, so if I had a fire or something I would still end up losing the data even though my external drive was kept in a safe. I still didn’t want to store data online, but that was still the best way to ensure I had an offsite backup. I initially thought about setting up a server at my parents’ place in Delhi and backing up there, but that didn’t work out for various reasons. Plus I didn’t want to have to call them and troubleshoot backup issues over the phone.

Around this time I was reading about encrypted partitions and came up with the idea of creating an encrypted container file to store my data and then backing up the container file online. I followed the steps I outlined in my post How to encrypt your Hard-drive in Linux and created the encrypted container. Once I finished that I had to upload the container to my webhost, since I had unlimited storage space as per my contract. Initially I wasn’t able to because they had restricted my account’s quota, but a call to their customer support sorted it out after a bit of argument and explaining what I was doing. The next hurdle I faced was uploading the file to the server because of the ridiculously low upload speed I was getting from Airtel. I had a 40 mbps connection at the time but the upload speed was restricted to 1 mbps because of ‘reasons’. After arguing with their support for a while, I was complaining about it at work and one of the folks suggested I check out ACT Internet. I checked out their plans and was quite impressed with the offerings, so I switched over to ACT and was able to upload the container file quickly and painlessly.

Once the container was uploaded, I had to tackle the next problem in the process which was on how to update the files in the container without having to upload the entire container to the host. I experimented with a few solutions and then came up with the following solution:

1. Mount the remote partition as a local mount using sshfs. I mounted the partition locally using the following command: (please replace with the correct hostname and username before using)

/usr/sbin/runuser -l suramya -c "sshfs -o allow_other @hostname.com:. /mnt/offsite/"

2. Once the remote partition was mounted locally, I was able to use the usual commands to mount the encrypted partition to another location using the following command:

/usr/sbin/cryptsetup luksOpen /mnt/offsite/container/Enc_vol1.img enc --key-file /root/UserKey.dat
mount /dev/mapper/enc /mnt/stash/

In an earlier iteration of the code I wasn’t using the keyfile, so I had to manually enter the password every time I wanted to back up to the offsite location. This meant that the backup was done randomly, as and when I remembered to run the command manually. A few days ago I finally configured it to run automatically after adding the keyfile as a decryption key. (Obviously the keyfile should be protected and not be accessible to others, because it allows anyone who has it to decrypt the data without entering a password.) Now the offsite backup runs once a week while the local backup runs daily, and I still back up the Backup partition to the external drive manually as and when I remember to do so.
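For anyone setting up something similar, here is a minimal sketch of how a keyfile could be generated and registered with an existing LUKS container. The helper name make_keyfile and the 4096-byte key size are assumptions for illustration, not something taken from my actual setup; the paths in the commented usage are the ones used in this post.

```shell
#!/bin/bash
# make_keyfile is a hypothetical helper: it writes 4096 random bytes to the
# given path and makes the file read-only for its owner.
make_keyfile() {
    local keyfile="$1"
    # The restrictive umask applies only inside the subshell, so the file is
    # never readable by other users, even briefly.
    ( umask 077 && dd if=/dev/urandom of="$keyfile" bs=512 count=8 2>/dev/null ) || return 1
    chmod 400 "$keyfile"
}

# Usage (as root). luksAddKey asks for an existing passphrase before it adds
# the keyfile as an additional way to unlock the container:
#   make_keyfile /root/UserKey.dat
#   /usr/sbin/cryptsetup luksAddKey /mnt/offsite/container/Enc_vol1.img /root/UserKey.dat
```

The original passphrase keeps working after this, so the keyfile is an extra key rather than a replacement.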

In all I was quite happy with my setup, but then I was updating the encrypted container and a network issue made me believe that my remote container had become corrupted (it hadn’t, but I thought it had). At the same time I was fooling around with Microsoft OneDrive and saw that I had 1TB of storage available over there since I was an Office 365 subscriber. This gave me the idea of backing up the container to OneDrive as well as to my site hosting.

I first tried copying the entire container to the drive and hit a limit because the file was too large. So I thought I would split the file into 5GB parts and then sync them to OneDrive using rclone. After installing rclone, I configured it to connect to OneDrive by issuing the following command and following the onscreen prompts:

rclone config

I then created a folder on OneDrive called container to store the split files and then tried uploading a test file using the command:

rclone copy $file OneDrive:container

Where OneDrive is the name of the rclone remote that I configured in the previous step. This was successful, so I just needed to create a script that did the following:

1. Update the Container file with the latest backup
2. Split the Container file into 5GB pieces using the following command:

split --verbose -d -b5GB /mnt/repository/Container/Enc_vol1.img /mnt/repository/Container/Enc_vol_

3. Upload the pieces to OneDrive.

for file in /mnt/repository/Container/Enc_vol_*; do echo "$file"; /usr/bin/rclone copy "$file" OneDrive:container -v &>> /tmp/oneDriveSync.log; done

This command uploads the pieces to the drive one at a time and is a bit slow because the upload speed tops out at ~2 Mbps. If you split the uploads and run the command in parallel then you get a much faster overall speed. Keep in mind that if you are uploading more than 10 files at a time you will start getting errors about too many open connections, and then you have to wait for a few hours before you can upload again. It took a while to upload the chunks, but now my files are stored in yet another location and the system is configured to sync to OneDrive once a month.
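The parallel upload mentioned above could look something like the sketch below, which uses xargs to keep a fixed number of rclone processes running at once. The helper name upload_chunks, the RCLONE_BIN override and the default of 4 parallel jobs are assumptions for illustration; 4 is just a number comfortably under the 10-file limit described above.

```shell
#!/bin/bash
# upload_chunks is a hypothetical helper: it uploads every file whose name
# starts with PREFIX to REMOTE, running at most JOBS uploads at a time.
upload_chunks() {
    local prefix="$1" remote="$2" jobs="${3:-4}"
    # RCLONE_BIN is an assumed override so the binary path can be changed.
    local rclone_bin="${RCLONE_BIN:-/usr/bin/rclone}"
    local chunks=( "$prefix"* )

    # If the glob matched nothing, the array holds the literal pattern.
    if [ ! -e "${chunks[0]}" ]; then
        echo "No chunks found for prefix $prefix" >&2
        return 1
    fi

    # NUL-separated names keep odd filenames intact; -P caps the parallelism
    # and -I{} runs one rclone process per chunk.
    printf '%s\0' "${chunks[@]}" |
        xargs -0 -P "$jobs" -I{} "$rclone_bin" copy {} "$remote" -v
}

# Usage:
#   upload_chunks /mnt/repository/Container/Enc_vol_ OneDrive:container 4
```

xargs exits with a non-zero status if any of the uploads fails, so the result can still be checked in the same way as the other backup steps.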

So, as of now my files are backed up as follows:

  • /mnt/Backup: Local Drive. All changes are backed up daily using rdiff-backup
  • /mnt/offsite: Encrypted Container stored online. All changes are backed up weekly using rsync
  • OneDrive: Encrypted Container stored at Microsoft OneDrive. The container is split into 5GB pieces and uploaded monthly using rclone
  • External Drive: Encrypted backup stored in an External Hard-drive using rsync. Changes are backed up infrequently manually.
  • Laptop: All Important files are copied over to the laptop using Unison/rsync manually so that I can access my data while traveling
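The OneDrive copy is only useful if the 5GB pieces can be put back together into a working container. A minimal sketch of how that could be done after downloading the pieces is shown below; reassemble_chunks is a hypothetical helper name and the paths in the usage comment are placeholders. It relies on the numeric suffixes produced by split -d sorting in the same order as the shell glob.

```shell
#!/bin/bash
# reassemble_chunks is a hypothetical helper: it concatenates every file
# whose name starts with PREFIX, in glob (lexicographic) order, into OUTPUT.
# OUTPUT must not itself match PREFIX, or it would be read as a chunk.
reassemble_chunks() {
    local prefix="$1" output="$2"
    cat "$prefix"* > "$output"
}

# Usage (placeholder paths):
#   sha256sum Enc_vol1.img                       # record before splitting
#   reassemble_chunks /restore/Enc_vol_ /restore/restored.img
#   sha256sum /restore/restored.img              # should print the same hash
```

Comparing the checksum of the rebuilt file against one recorded before splitting is a cheap way to confirm that nothing was lost or reordered in transit.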

Finally, I am also considering backing up the snapshot data to Blu-ray disks, but it will take time so I haven’t gotten around to it yet.

Since I have this elaborate backup procedure I wasn’t worried much when one of my disks died last week, and I was able to continue work without issues or worries about losing data. I still think I can enhance the backups I take but for now I am good. If you are interested in my backup script, an extract of the code is listed below:

function check_failure ()
{
	if [ $? -eq 0 ]; then
		logger "INFO: $1 Succeeded"
	else
		logger "FATAL: Execution of $1 failed"
		wall "FATAL: Execution of $1 failed"
		exit 1
	fi
}

###
# Syncing to internal Backup Drive
###

function local_backup ()
{
	export BACKUP_ROOT=/mnt/Backup/Snapshots
	export PARENT_ROOT=/mnt/repository

	logger "INFO: Starting System Backup"

	rdiff-backup -v 5 /mnt/data/Documents/ $BACKUP_ROOT/Documents/
	check_failure "Backing up Documents"

	rdiff-backup -v 5 /mnt/repository/Documents/Jani/ $BACKUP_ROOT/Jani_Documents/
	check_failure "Backing up Jani Documents"

	rdiff-backup -v 5 $PARENT_ROOT/Programs/ $BACKUP_ROOT/Programs/
	check_failure "Backing up Programs"

	..
	..

	logger "INFO: All Backups Completed Successfully."
}

### 
# Syncing to Off-Site Backup location
###

function offsite_backup
{
	export PARENT_ROOT=/mnt/repository

	# First we mount the remote directory to local
	logger "INFO: Mounting External Drive"
	/usr/sbin/runuser -l suramya -c "sshfs -o allow_other username@remotehost:. /mnt/offsite/"
	check_failure "Mounting External Drive"

	# Open the Encrypted Partition
	logger "INFO: Opening Encrypted Partition using the keyfile."
	/usr/sbin/cryptsetup luksOpen /mnt/offsite/container/Enc_vol1.img enc --key-file /root/keyfile1
	check_failure "Mounting Encrypted Partition Part 1"

	# Mount the device
	logger "INFO: Mounting the drive"
	mount /dev/mapper/enc /mnt/stash/
	check_failure "Mounting Encrypted Partition Part 2"

	logger "INFO: Starting System Backup"
	rsync -avz --delete  /mnt/data/Documents /mnt/stash/
	check_failure "Backing up Documents offsite"
	rsync -avz --delete /mnt/repository/Documents/Jani/ /mnt/stash/Jani_Documents/
	check_failure "Backing up Jani Documents offsite"
	..
	..
	..

	umount /mnt/stash/
	/usr/sbin/cryptsetup luksClose enc
	umount /mnt/offsite/

	logger "INFO: Offsite Backup Completed"
}
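One gap in the extract above is that check_failure exits immediately when a step fails, which can leave /mnt/stash or /mnt/offsite mounted and the enc mapping open. A minimal sketch of one way to handle that with an EXIT trap is below. The function name cleanup_offsite is hypothetical and this is not part of the script as it currently runs.

```shell
#!/bin/bash
# cleanup_offsite is a hypothetical helper: it undoes the mounts in reverse
# order and skips anything that is not currently mounted or open.
cleanup_offsite() {
    mountpoint -q /mnt/stash && umount /mnt/stash
    [ -e /dev/mapper/enc ] && /usr/sbin/cryptsetup luksClose enc
    mountpoint -q /mnt/offsite && umount /mnt/offsite
    return 0
}

# Run the cleanup whenever the script exits, including the "exit 1" path
# taken by check_failure.
trap cleanup_offsite EXIT
```

With the trap in place, the explicit umount and luksClose calls at the end of offsite_backup become a normal-path shortcut, because the same cleanup runs on every exit.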

This is how I make sure my data is backed up. All of Jani’s data is also backed up to my system using robocopy, as she is running Windows, and then the data gets backed up by the scripts I explained above as usual. I also have scripts to back up my website/blog/databases, but those are simple ones. Let me know if you are interested and I will share them as well.

Let me know if you have any questions about the backup strategy or if you want to make fun of me. 🙂 This is all for now. Will write more later.

– Suramya
