Suramya's Blog : Welcome to my crazy life…

October 4, 2022

Workaround for VPN Unlimited connection issues with latest Debian

VPN’s are a great way to ensure that your communication remains private when using a pubic internet connection such as when you are connected to an Airport or Coffee shop Wifi. Plus they are good for getting access when a site is blocked where you are, for example in India the main site for VLC Media player has been blocked for a while. I primarily use VPN Unlimited on all my systems as I have a lifetime subscription though I also have other VPN’s that I use sometimes.

Unfortunately, the native VPN Unlimited application for Linux has stopped working a while ago due to a compatibility issue with SSL. When I upgraded to the latest version of Debian back in July 2022 it suddenly stopped working with the following error message:

vpn-unlimited: symbol lookup error: /lib/ undefined symbol: EVP_CIPHER_block_size

Reinstalling the software didn’t resolve the issue and neither did a search on the internet help. When I reached out to support they told me that Debian 11 wasn’t yet supported and they didn’t have an ETA for the new version to be released. They did recommend that I manually create & download an openvpn config from their site that would allow me to connect to the VPN manually using OpenVPN instead of the App. Unfortunately, the config generated didn’t work either as it would fail to connect with the following error message in the logs:

Sep 21 02:56:55 StarKnight NetworkManager[1123]:  [1663709215.0845]vpn[0x559d7fc46900,833a72d8-a08a-474e-a854-c926cd6c694a,"VPN Unlimited"]: starting openvpn
Sep 21 02:56:55 StarKnight NetworkManager[1123]:  [1663709215.0847] audit: op="connection-activate" uuid="833a72d8-a08a-474e-a854-c926cd6c694a" name="VPN Unlimited" pid=2829 uid=1000 result="success"
Sep 21 02:56:55 StarKnight kded5[2780]: org.kde.plasma.nm.kded: Unhandled VPN connection state change: 2
Sep 21 02:56:55 StarKnight kded5[2780]: org.kde.plasma.nm.kded: Unhandled VPN connection state change: 3
Sep 21 02:56:55 StarKnight NetworkManager[233850]: 2022-09-21 02:56:55 WARNING: Compression for receiving enabled. Compression has been used in the past to break encryption. Sent packets are not compressed unless
"allow-compression yes" is also set.
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: DEPRECATED OPTION: --cipher set to 'AES-256-CBC' but missing in --data-ciphers (AES-256-GCM:AES-128-GCM:CHACHA20-POLY1305). OpenVPN ignores --cipher for cipher negotiations.
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: OpenVPN 2.6_git x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] [DCO]
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: library versions: OpenSSL 3.0.5 5 Jul 2022, LZO 2.10
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: WARNING: No server certificate verification method has been enabled. See for more info.
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: OpenSSL: error:0A00018E:SSL routines::ca md too weak
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: Cannot load certificate file /home/suramya/.local/share/networkmanagement/certificates/E87E7A7D6DA16A89C7B4565273D3A792_hk_openvpn/cert.crt
Sep 21 02:56:55 StarKnight nm-openvpn[233850]: Exiting due to fatal error
Sep 21 02:56:55 StarKnight NetworkManager[1123]:  [1663709215.1095] vpn[0x559d7fc46900,833a72d8-a08a-474e-a854-c926cd6c694a,"VPN Unlimited"]: dbus: failure: connect-failed (1)
Sep 21 02:56:55 StarKnight NetworkManager[1123]:  [1663709215.1095] vpn[0x559d7fc46900,833a72d8-a08a-474e-a854-c926cd6c694a,"VPN Unlimited"]: dbus: failure: connect-failed (1)

After a little more back and forth with the support team (which was extremely responsive and quick) which in turn reached out to their developers we identified the issue with the OpenVPN config. The fix for this will be deployed to all their servers by the end of this month. In the mean time I was given a workaround that resolved the issue for me. To fix the issue add this line to your OVPN file under the VPN section:


More information on this is available in the OpenVPN forum. Keep in mind that this is not a really secure configuration and if you are working on something really top secret you should use another VPN till the issue is actually fixed instead of this workaround as it is not secure.

However, just wanted to share this here for others who might be having this same issue. Hope this helps.

– Suramya

August 8, 2022

Using Behavioral Biometrics for User Authentication as added security measures – Advantages and Disadvantages

Filed under: Article Releases,Computer Security,My Thoughts — Suramya @ 11:59 PM

In this paper we explore how users can be uniquely identified using biometrics other than fingerprints, facial recognition, iris recognition etc on a continuous basis. We explore the use to techniques such as typing style, computer use style to see if we can create a model to uniquely identify a user based on the way they type and use the computer. As this method allows a system to constantly reauthenticate a user based on characteristics that are almost impossible to fake we look at the complexity of how this can be integrated as a security measure for secure systems. We also look at the pros and cons of implementing this authentication mechanism and explore potential problems this system generates for the user and administrators. Specifically, we look at how the system would deal with users who are sick, under medication or stress that could change their usage patterns and is it worth the expense and privacy issues to implement such a system.

Introduction and background

User authentication is the process of verifying the identify of a user or process trying to access a system, online service, connected device, infrastructure resources etc. Traditionally authentication is done by having the user provide one or more of the following:

  • Something they know
  • Something they have
  • Something they are

Let’s look at each of these one by one. The oldest way of authentication to computer systems is using usernames and passwords. The first password protection system was implemented in 1961 by Fernando J. Corbató at MIT (Workos, 2020). This allowed the system to identify users based on a secret password that only they knew. The first set of passwords were stored in plain text, but then password encryption was implemented so that users could not read the passwords for other users.

However, passwords can be leaked or guessed. In the past few years there have been major leaks of authentication data which have been decrypted and sophisticated password crackers have been created that can crack passwords based on dictionary attacks and brute force attacks. To safeguard against this attack vector another authentication mechanism was created that authenticates users based on something they have with them. This can include hardware keys, smartcards etc and these hardware devices would contain an embedded certificate that can be used to uniquely identify the holder.

The final method of authentication is something you are, which is provided by Biometric authentication. Some of the biometric methods that can be used are fingerprints, hand geometry, retinal or iris scans, face scans, and voice analysis. Fingerprints, Face Scans and iris scans are the most widely used biometric method in use today.

Multifactor Authentication
When a system uses a combination of one or more of the authentication methods described in the previous section the system is said to be using Multi-factor Authentication (MFA). The key point to remember is that a system is only considered to be using MFA if the authentication factors are in at least two of the categories. So if the authentication mechanism uses a password and a second pin to authenticate, it won’t count as MFA because both are things that you know.

Weaknesses in the current User Authentication methods

The current user authentication methods have several weaknesses that make it easy for attackers to compromise and bypass the checks. Complex passwords are harder to crack or guess than simple passwords, but they are harder for users to remember. So, users tend to use the same passwords across multiple sites or use passwords that are simple to remember. Unfortunately, passwords that are simple to remember are also easy to guess.
Another risk is that an attacker can compromise a site or server using vulnerabilities in the OS, services or applications running on it. Once they have access, they can gain access to the stored passwords for all users and depending on the encryption scheme used the passwords for user accounts can be guessed quickly. This is an attack vector that has been seen frequently over the past few years with password lists for major sites such as LinkedIn (Morris, 2021) and Yahoo (Goel & Perlroth, 2016) etc being compromised and leaked.

Hardware tokens or smart cards can be cloned, copied or stolen. If the card is not deactivated when it is lost or stolen an attacker can use it to gain access to restricted resources. Tools to create copies of smartcards are available easily in the market (Benchoff, 2016) using which an attacker can clone the cards quickly.

Biometrics was touted as an authentication mechanism that is almost impossible to bypass but unfortunately the hype didn’t match reality. Fingerprint authentication systems have been compromised using copies of fingerprints lifted from glasses, door knobs etc transferred to jello, Glycerin and gelatin. (Barral & Tria, 2009)

Facial recognition systems have been fooled by photographs and cosmetics. Researchers have also used the StyleGAN Generative Adversarial Network (GAN) to create master faces that can be used to impersonate 40% of the population. (Shmelkin et al., 2021)

Voice authentication systems have been bypassed using voice recordings and AI based ‘deep fake’ technologies. Amazon recently showcased technology that allows Alexa to impersonate the voices of people based on a few minutes long voice recording of the person being impersonated.

Similar bypasses have been found for all authentication mechanisms in use currently and thus researchers have been exploring new authentication mechanisms which would be harder to bypass and fool. One such field being explored in behavioral biometrics and we will explore the field, it’s implications, the pros and cons of the tech in this paper.

Introduction to Behavioral Biometrics

Behavioral biometrics is the study and use of uniquely identifying and measurable patterns in human activities that can include keystroke dynamics, gait analysis, mouse use characteristics, signature analysis etc. The field postulates that a user can be identified based on these characteristics just as uniquely as they can be using physical biometrics.

Another advantage of using Behavioral Biometrics over physical biometrics is that it doesn’t require specialized equipment to collect the data. Data can be collected using existing hardware and only requires software analysis and processing which makes it cheaper to implement to a certain extent and we will look at this in more detail later in the paper.

Behavioral Biometrics can include the following:

Keystroke Dynamics:

According to the studies, if a group of users is asked to type the paragraph of text, each of them will type the text slightly differently with different delays between each character being typed, and different rhythms for the text. This allows a system to identify the user based on how they type including criteria such as:

  • The user’s typing speed
  • Time elapsed between each consecutive keystroke
  • The time that each key is held down
  • The frequency with which the number pad keys are used
  • The timing and sequence of the keys used to type a capital letter
  • The Error Rate in typing, such as using the Backspace keys and words repeatedly mistyped by the user.

As each person would type the password slightly differently the system can use it to identify the authorized user and block attackers who might have gained the password for a given user.

Cursor Movement:

This uses the tracking speed, clicks and path taken by the mouse cursor movement during use to create a profile for the active user. This would be useful if the user uses the same set of applications frequently, if they are using a varied set of applications that keep changing then this would not be accurate.

Finger pressure on keypad:

This analyses the pressure on the keyboard to create a user profile. This is a lot more relevant for mobile devices and other devices with a touchscreen interface as the allow us to capture pressure details easily without extra hardware.

Every person has a different way of standing and a sufficiently trained system can look for differences in how the person sits in front of the computer and their posture while using the system.


Gait analysis attempts to identify a person based on their walking style, which includes movements such as stride length, posture, and speed of travel etc.

Each of the methods we listed above can potentially be used to continuously re-validate a logged in user.

Historical use of Behavioral Biometrics for authentication

Historically, behavioral biometrics have been in use since the 1860s when experienced telegraph operators were able to identify individual operators by the way they would send the signals. In World war II allied officers used it to validate the authenticity of messages they received based on how they were sent. (Das, 2020) Similarly, other organizations used this ability as well as an extra validation layer when communicating instructions over telegraph.

The Military has also used gait recognition to identify imposters in their base who are trying to impersonate authorized personnel to gain access to sensitive information.

Current state & the Future for Behavioral Biometrics

The behavioral biometrics market revenue totaled ~US$ 1.1 Bn in 2020, according to Future Market Insights (FMI). The overall market is expected to reach ~US$ 11.2 Bn by 2031, growing at a CAGR of 23.6% for 2021 – 31. (Future Market Insights, 2021)

As we can see, an increasing number of institutes, financial companies, website owners are using behavioral biometrics in their systems to detect fraudulent usage. The Royal Bank of Scotland uses it to monitor visitors to their websites and apps, others use it in their applications to monitor and authenticate users as an extra verification layer. (, 2018)

With the increase in processing capacity, sensor sensitivity and processing algorithms systems can make more accurate identifications of individual users. This allows systems to detect bots, password sharing/compromise.

Ecommerce sites have increasingly started incorporating this technology into their setup to prevent fraud. It can also potentially allow systems to make an educated judgment about the visitor’s gender and age to show appropriate products.

Considering the advantages and minimal hardware investment we will only see an increase in the use of Behavioral Biometrics for authentication in the future.

Advantages of using Behavioral Biometrics for authentication

Behavioral Biometrics have the following advantages that make them attractive for companies and institutes to implement:

  • Flexibility: The data being analyzed is not limited to currently identified sets that we have discussed so far. Since most of the processing being done is on the software side the organization can easily add additional behavioral data to be analyzed and processed.
  • Convenience: This a major plus point for the technology is that it is a passive layer of security. This allows it to work without interfering with the user workflows. This removes a major obstacle in incorporating security into the system as the traditional security setups decrease the usability of the system.
  • Efficiency: They can be applied in real-time to detect fraudulent use and the system can be run against historic data as well to detect improper use after fact.
  • Security: Behavioral characteristics are hard to replicate and thus incorporating this additional layer of security improves the security of the system.

Disadvantages of using Behavioral Biometrics for authentication

As with all systems there are some disadvantages of using a Behavioral Biometric system for authentication as well. If we are using the Keystroke analysis then the text being entered has to be long enough for the system to generate a profile and match it so if we are only using it as an additional validation step during password entry and the user’s password is too short, then the system might not be able to create and match a profile.

Another problem is that a user’s behavior can change drastically due to various valid reasons and that can cause access issues when the algorithm is unable to account for the changes. Some of the reasons can include:

  • Illness or Injury: If a person is injured or unwell then their usage patternswill change
  • Stress
  • Pregnancy
  • Sleep deficiency
  • Caffeine deficiency or overindulgence
  • Tiredness: If a user logs back in after a session in the gym their usage patterns are going to differ from the pattern before their gym session
  • Time of day: Some people are more active during certain times of day so their usage patterns will vary based on the time of the day.
  • Distractions: If the user is distracted while working , or example, if they are on a call and working at the same time. Their behavior patterns will be different.
  • Location: If the person logs in from a different location and are working with a different setup their metrics are going to be different. For example the profile when using an egronomic keyboard in office vs using a laptop keyboard while working remotely will be drasticly different and the system will have a hard time creating a consolidated profile for such users.

Another major issue with this technology is the Privacy implications. If we are implementing a system that monitors every keystroke and mouse movement and logs it for analysis then that has a serious privacy implication as sensitive data that shouldn’t be logged such as medical information, personal account passwords, other sensitive information etc can get logged as well. Once the data is logged there is a possibility of data leaks or a breach of the security system which would expose the collected information to an attacker.

Depending on the user’s location collection of this kind of data can be illegal due to rules such as the GDPR (Krausová, 2018), the California Consumer Privacy Act (CCPA) and other such rules. They will also limit the information that can be transmitted across state & country boundaries which can be a concern for multinational companies.

Finally incorporating the processing required for behavior analysis on the local system can be resource intensive which might make the setup infeasible for older machines. If the processing of the data is consolidated at a central location then the usage data would need to be transmitted to the location over the network that can potentially max out the bandwidth and depending on network congestion cause unacceptable delays in the processing and access.

Results and Recommendations

Based on our review of the current state of Behavioral Biometrics in the industry and the technological state of the system/algorithms we find that the technology does help increase the security of the system by adding an additional layer of security to the system. However, it is not yet mature enough to deploy for general commercial implementation and should only be used for securing highly sensitive systems and infrastructure where the security considerations outweigh the limitations identified earlier in the paper.
Once the technology is more mature and the issues identified earlier have been mitigated it can slowly be incorporated in the general computing world as an optional additional layer of security. At no point should this be used as the only layer of security for any system.


Behavioral Biometrics as a security measure is a technology still in its early stages of use and implementation and while it does add an additional layer of security the current limitations do not justify a general release and implementation in general use computing. The system should only be implemented in systems such as classified military systems, critical corporate servers containing highly sensitive information etc where the benefits or security concerns outweigh the disadvantages of using a technology that still needs to mature more.


Alzubaidi, A., & Kalita, J. (2016). Authentication of smartphone users using behavioral biometrics. IEEE Communications Surveys & Tutorials, 18(3), 1998–2026.

Araujo, L. C. F., Sucupira, L. H. R., Lizarraga, M. G., Ling, L. L., & Yabu-Uti, J. B. T. (2005). User authentication through typing biometrics features. IEEE Transactions on Signal Processing, 53(2), 851–855.

Banerjee, S. P., & Woodard, D. (2012). Biometric authentication and identification using Keystroke Dynamics: A survey. Journal of Pattern Recognition Research, 7(1), 116–139.

Barral, C., & Tria, A. (2009). Fake fingers in fingerprint recognition: Glycerin supersedes gelatin. Formal to Practical Security, 57–69.

Benchoff, B. (2016, January 18). Emulating and cloning smart cards. Hackaday. Retrieved June 27, 2022, from

Bo, C., Zhang, L., Li, X.-Y., Huang, Q., & Wang, Y. (2013). Silentsense. Proceedings of the 19th Annual International Conference on Mobile Computing & Networking – MobiCom ’13.

Das, R. (2020, October 14). A behavioral biometric – keystroke recognition. A Behavioral Biometric – Keystroke Recognition.
Future Market Insights. (2021, October). Behavioral biometrics market. Future Market Insights.

Goel, V., & Perlroth, N. (2016, December 14). Yahoo says 1 billion user accounts were hacked. The New York Times.

Krausová, A. (2018). Online behavior recognition: Can we consider it biometric data under GDPR? Masaryk University Journal of Law and Technology, 12(2), 161–178.

Morris, C. (2021, June 30). LinkedIn data theft exposes personal information of 700 million people. Fortune. (2018, August 15). What’s behind the rise of behavioral biometrics? Retrieved June 27, 2022, from

Shmelkin, R., Friedlander, T., & Wolf, L. (2021). Generating master faces for dictionary attacks with a network-assisted Latent Space evolution. 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021).

Workos. (2020, September 5). A developer’s history of authentication – WorkOS. A Developer’s History of Authentication.

Note: This was originally written as a paper for one of my classes at EC-Council University in Q2 2022.

– Suramya

August 6, 2022

Post Quantum Encryption: Another candidate algorithm (SIKE) bites the dust

Filed under: Computer Security,Computer Software,Quantum Computing — Suramya @ 8:23 PM

Quantum Computing has the potential to make the current encryption algorithms obsolete once it gets around to actually being implemented on a large scale. But the Cryptographic experts in charge of such things have been working on Post Quantum Cryptography/Post Quantum Encryption (PQE) over the past few years to offset this risk. SIKE was one of KEM algorithms that advanced to the fourth round earlier this year and it was considered as an attractive candidate for standardization because of its small key and ciphertext sizes.

Unfortunately while that is true researchers have found that the algorithm is badly broken. Researchers from the Computer Security and Industrial Cryptography group at KU Leuven published a paper over the weekend “An Efficient Key Recovery Attack on SIDH” (Preliminary Version) that describes a technique which allows an attacker to recover the encryption keys protecting the SIKE Protected transactions in under an hours time using a single traditional PC. Since the whole idea behind PQE was to identify algorithms that are stronger than the traditional ones this immediately disqualifies SIKE from further consideration.

Abstract. We present an efficient key recovery attack on the Supersingular Isogeny Diffie–Hellman protocol (SIDH), based on a “glue-and-split” theorem due to Kani. Our attack exploits the existence of a small non-scalar endomorphism on the starting curve, and it also relies on the auxiliary torsion point information that Alice and Bob share during the protocol. Our Magma implementation breaks the instantiation SIKEp434, which aims at security level 1 of the Post-Quantum Cryptography standardization process currently ran by NIST, in about one hour on a single core.

The attack exploits the fact that SIDH has auxiliary points and that the degree of the secret isogeny is known. The auxiliary points in SIDH have always been an annoyance and a potential weakness, and they have been exploited for fault attacks, the GPST adaptive attack, torsion point attacks, etc.

This is not a bad thing as the whole testing and validation process is supposed to weed out weak algorithms and it is better to have them identified and removed now than after their release as then it becomes almost impossible to phase out systems that use the broken/compromised encryption algorithms.

Source: Schneier on Security: SIKE Broken

– Suramya

June 5, 2022

Hacking a computer using Ham radio transmissions is now possible!

Filed under: Computer Security,Computer Software,Tech Related — Suramya @ 11:59 PM

Hacking a computer by getting them to listen to a Ham Radio station broadcast seems like the plot of a bad movie or TV series about ‘hackers’ but this is not a fictional story. It is now in fact possible to hack a WinXP & Windows 10 computer over the air, All we need to do is ensure that the target is using WinARPS on their computer to listen to the broadcast and then they are fair game.

I am in awe of this finding because figuring out how to generate radio packets that will cause a memory overflow/corruption and then figure out who to generate the packets in a way that allows you to get RCE (Remote Code Execution) requires phenomenal hacking skills and understanding of the underlying systems.

WinARPS is unlikely to get a fix for the issue because the author no longer has an environment to build/test the software as the last update to the code was back in 2013. However the author is aware of the problem and who knows they might get the environment working again and fix the issue.

Video demo of the issue on a Windows 10 machine (Credit:

This bug does show us that we can have the world’s most protected / isolated system but if there is any way to get external information/input then the system can potentially be attacked.

You can read the full walk through of the process at: Hacking Ham Radio: WinAPRS – Part 5

– Suramya

May 5, 2022

Thoughts around using GPS tracking to stop car thieves

Filed under: Computer Security,My Thoughts,Tech Related — Suramya @ 2:56 PM

Earlier today, I saw the following tweet Retweeted by the BengaluruCityPolice where they recommend that we install a hidden GPS tracker in the car that can be used to find the car if it is ever stolen.

On the surface this sounds like a great idea but there are larger implications that we are missing here. But first lets talk about why this wouldn’t work for long:

  • The thief’s are not fools, once this technique starts getting more popular the first thing they will do is search the car from top to bottom to find and remove the tracker.
  • If the car is underground or behind concrete/metal then the GPS tracker will not be able to transmit. So no signal.

There are other reasons as well but these are the top two that make the tracker useless. Now let’s look at the drawbacks shall we:

Once we have a GPS tracker in the car, all movement information of the car is now tracked and stored online. The current data privacy laws in India allow cops or others to get access to this data fairly simply. This data can also be sold to others (after anonymizing it) but it is quite simple to de-anonymize a dataset as proven by various people recently, such as the case last year where a Priest was outed as a user of Grindr app due to data de-anonymizer.

This is especially risky for women as this potentially allows people to figure out where they live or work, what their schedule looks like etc. Another problem is misuse of data by the company hosting it. History has shown that insiders at companies that store private data have used their access to view private details. This includes cops, tech employees etc. So the more data that is stored the more risk of data misuse and this doesn’t take into account the possibility of attackers hacking into the network to steal the movement data.

Once people have the data, it can then be used for many things such as:

  • Abusers can track their victims (wives/kids)
  • Identify who is having an affair with whom (Uber did this)
  • Figure out who is undergoing medical treatments
  • Criminals can see when we are on vacation and the house is empty.
  • Locate people who are traveling home at late night through empty areas
  • Employers could begin tracking employees to see if an employee is thinking about leaving by looking at visits to competitor’s office etc

These are not theoretical concerns there are been proven cases for each of the above. The risk is grave enough that the US Women’s Law Organization, which deals with a lot of domestic abuse cases has a whole section dedicated to GPS monitoring abuse.

We need to look at all aspects of the technology before we start implementing on a large scale. This includes looking at how the tech could potentially be misused.

– Suramya

April 29, 2022

Malware in Windows: TPM Bypasses & Firmware level persistence

Malware is the short form for Malicious Software and is basically software that allows attackers to infect a computer system or device to steal information, disrupt operations or gain access to sensitive data. It is a general term that includes viruses, worms, trojans, spyware, rootkits etc. (Cisco, 2021)

Conceptually the foundations for creating malware were laid almost simultaneously with the creation of the first computers. In 1951, John von Neumann proposed methods on how to create self-replicating automata (Neumann, 1951) and a few years later in 1959 Lionel Penrose published his paper on ‘Self-Reproducing Machines’ this paper was used as the basis for creating replicating machine code that were the basis of the later generations of malware. In 1970’s the creeper virus infected the ARPANET (Milošević, 2013) followed shortly after by Rabbit (Milošević, 2013) which spread rapidly to computers and created copies of itself overloading the machine and impacting system performance. (Milošević, 2013)

In the 1986, the first malware called Brain.A that targeted the PC platform was released. (Milošević, 2013) It used floppy disks as the infection mechanism by infecting the boot sector of every floppy disk used in an infected computer. Other viruses of the time used similar mechanisms to propagate and were quite prevalent by the measures of the time. Once Microsoft Windows was released viruses were created that targeted the new operating system with WinVir being the first virus for the new operating system, it gained persistence by modifying the Windows Executable files. (Milošević, 2013) It spread to new systems over floppy disks.
For almost a decade, infected disks and CD’s remained the primary method of infection for computers. In 1998 this changed with the release of Happy99 in late 1998 that spread via email attachments. Another popular vector for virus infections was macro viruses that infected Microsoft word files which were shared frequently with other users allowing the virus to spread. With the increasing popularity of the Internet, the new malware created during this time leveraged the internet as a transmission vector.

In early 2000, Code Red worm was created that leveraged vulnerabilities in the IIS webservers to propagate. (Milošević, 2013) This opened a new infection vector where the malware would scan for and exploit systems running vulnerable software.

Over the years, malware has become more and more common and has evolved to gain persistence using multiple methods such as using rootkits to infect the OS kernel and other such methods. The one constant throughout the years was that we could clean up a malware infection by formatting the infected drive and restoring from a clean backup. As long as the backup and the installation media were clean we could be confident that the infection was cleared.

Unfortunately, this is no longer the case with new strains of malware using sophisticated techniques to gain persistence using the computer firmware.

A. UEFI malware – The early years

UEFI rootkits were referenced in various leaks and were considered mostly theoretical. The Hacking Team referenced something called ‘rkloader’ in their internal presentations and the Vault7 leaks referenced ‘DerStarke’ which was an EFI/UEFI boot implant. But there was no real evidence of these being used so they were considered mostly theoretical for the most part.

This changed in 2018 when the first rootkit that leveraged the UEFI to achieve persistence was discovered. This malware called Lojax was created by the Sednit APT group. It used a malicious UEFI module written into the SPI flash memory to ensure that it was able to execute malware during the boot up process. (ESET Research, 2018)

B. UEFI Malware – Infecting SPI flash memory

The LoJax malware used the kernel driver RwDrv.sys to access the UEFI settings. The driver is distributed with RWEverything, a freeware utility that can read the BIOS information in most computers. (ESET Research, 2018)

The malware used this driver to read the contents of the SPI flash memory into a file, by running a file called ReWriter_binary.exe. The data in the SPI is stored in volumes using the Firmware File System (FFS). It then parses the volues to search for the Ip4Dxe file. This file along with DXE Core is then modified to add the malicious UEFI module to it post which the entire file is written back to the SPI memory. If the configuration allows write access to SPI the malware immediately writes to the SPI memory but if write access is disabled it exploited a race condition vulnerability in the BIOS locking mechanism to bypass the write protection in SPI flash memory. (CERT, 2015)

C. MoonBounce: UEFI Bootkit

The MoonBounce Bootkit is the third instance of malware that uses UEFI to gain persistence, with Lojax and MosaicRegressor being the other two instances where it was used.

MoonBounce is a lot more sophisticated than the previous iterations and it executes completely in the system memory without writing anything to the hard drive making it a lot harder to detect than the previous iterations of the malware. It stages the execution and deployment of payloads over the internet allowing the attacker to deploy payloads on the system to achieve specific tasks.
MoonBounce was detected in spring 2021 and like the previous iterations attacks the DXE Core module in UEFI to infect the SPI Memory.

D. Using TPM Module & Trusted Computing to protect against this attack

The TPM Module in the modern machines is designed to provide hardware-based, security-related functions and allows the system to secure the system using integrated cryptographic keys.

If TPM is enabled and is being used correctly then it gives the system a way to ensure that all firmware and boot files are unmodified. If any of the files are modified then they will not pass the cryptographic check and the boot process will be halted. This would prevent the infected SPI memory from being loaded and would warn the defenders that their system has been breached.

Unfortunately, it is possible to disable the TPM chip for historical compatibility reasons, so the malware can do the same. One of the ways to disable the check and bypass the Secure Boot & TPM check is to modify the registry files in Windows. The steps to do so are very simple and are shown below (Tibbetts, 2021):

  • At the run prompt type in regedit, and press Enter.
  • Go to Computer\HKEY_LOCAL_MACHINE\SYSTEM\Setup
  • Right-click on Setup and click New > Key. Name that LabConfig
  • Click on LabConfig, then right-click on the right pane, and click New > DWORD (32-bit Value).
  • Name the entry as BypassTPMCheck and change its Value data to 1
  • Create two more DWORDS and change the Value data to 1 just like you did above and name them BypassRAMCheck and BypassSecureBootCheck.

This removes the check for Secure Boot and while it can be desired at times it does open up the system to risk so should only be used for specific use cases where no other option is available.

Protecting against malware using firmware level persistence

To protect against this threat, we need to ensure that all components of the operating system and software on the computer are patched and updated to the latest version. We should enable end-point monitoring and IDS on the network to detect infection attempts. This will allow us to detect the malware before it infects the system and block it pre-emptively. The internet and email gateways should scan all incoming files to detect and block malware. In addition to the standard precautions to protect against malware, we should also ensure that all systems on the network are running the latest version of the UEFI/BIOS available.

Unfortunately, the remediation of the security issues in UEFI is a hard problem and doesn’t have an easy solution. So, the best way to protect against the threat is to try to prevent the system from getting infected in the first place.

Another option to detect infected SPI Memory is to create a tool that periodically creates a dump of the SPI memory and compares the checksum of the dump with a known clean dump. If the values don’t match then there is a high probability that the memory is infected and the administrators can then take steps to clean the firmware by flashing it with a known clean version of the firmware.

With the new methods of persistence available to the malware writers the best way to protect the assets is to try to ensure that you prevent the infection from happening in the first place. Once the machine is infected the task becomes harder and we would need to spend extra time and effort to clean and restore the systems to a clean state.
Done correctly this will decrease the risk of data exfiltration but no technique to detect infection is perfect so a lot of review and audits need to be done on a periodic basis to ensure that the system is still secure.


CERT. (2015, January 5). CERT/CC Vulnerability note vu#766164. VU#766164 – Intel BIOS locking mechanism contains race condition that enables write protection bypass. Retrieved March 21, 2022, from

Cisco. (2021, July 30). What is malware? – definition and examples. Cisco. Retrieved March 21, 2022, from
ESET Research. (2018, October 9). Lojax: First UEFI rootkit found in the wild, courtesy of the Sednit Group. WeLiveSecurity. Retrieved March 21, 2022, from

Neumann, J. V. (1951). Massachusetts Institute of Technology. Theory of Self Replicating Automata. Retrieved March 21, 2022, from
Tibbetts, T. (2021, July 10). How to bypass secure boot & trusted platform module. Providing Free and Editor Tested Software Downloads. Retrieved March 21, 2022, from

This was a paper for my Class in Q1 2022 which is why it is more formal than my usual posts.

April 28, 2022

Microsoft finds a Linux flaw that grants root access to untrusted users

Filed under: Computer Security,Linux/Unix Related,Tech Related — Suramya @ 11:30 AM

Now that is not a heading I thought I would ever write… I mean 20 years ago imagining that Microsoft would be working with Linux to the point where it would find and report a bug in Linux was unimaginable. For the longest time MS considered Linux to be a massive danger to it’s operations which is why former Microsoft CEO Steve Ballmer famously branded Linux “a cancer that attaches itself in an intellectual property sense to everything it touches” back in 2001. However that has now changed and Windows now has a Windows Subsystem for Linux (wsl) that allows users to run Linux programs from within Windows seamlessly.

This particular flaw which is tracked as CVE-2022-29799 and CVE-2022-29800 combine threats including directory traversal, symlink race, and time-of-check time-of-use (TOCTOU) race condition to gain root access. It was found when a Microsoft researcher Jonathan Bar Or was examining the code for a component known as “_run_hooks_for_state”. The flow to exploit would look something like the following (Thanks ARS Technica for the walkthrough):

Prepare a directory ”/tmp/nimbuspwn” and plant a symlink ”/tmp/nimbuspwn/poc.d“ to point to “/sbin”. The “/sbin” directory was chosen specifically because it has many executables owned by root that do not block if run without additional arguments. This will abuse the symlink race issue we mentioned earlier.
For every executable filename under “/sbin” owned by root, plant the same filename under “/tmp/nimbuspwn”. For example, if “/sbin/vgs” is executable and owned by root, plant an executable file “/tmp/nimbuspwn/vgs” with the desired payload. This will help the attacker win the race condition imposed by the TOCTOU vulnerability.
Send a signal with the OperationalState “../../../tmp/nimbuspwn/poc”. This abuses the directory traversal vulnerability and escapes the script directory.
The networkd-dispatcher signal handler kicks in and builds the script list from the directory “/etc/networkd-dispatcher/../../../tmp/nimbuspwn/poc.d”, which is really the symlink (“/tmp/nimbuspwn/poc.d”), which points to “/sbin”. Therefore, it creates a list composed of many executables owned by root.
Quickly change the symlink “/tmp/nimbuspwn/poc.d” to point to “/tmp/nimbuspwn”. This abuses the TOCTOU race condition vulnerability—the script path changes without networkd-dispatcher being aware.
The dispatcher starts running files that were initially under “/sbin” but in truth under the “/tmp/nimbuspwn” directory. Since the dispatcher “believes” those files are owned by root, it executes them blindly with subprocess.Popen as root. Therefore, our attacker has successfully exploited the vulnerability.

The vulnerability has been patched in the networkd-dispatcher and users running vulnerable systems should patch immediately.

Source: Microsoft finds Linux desktop flaw that gives root to untrusted users

– Suramya

April 25, 2022

Rainbow Algorithm (one of the candidates for post-quantum Cryptography) can be broken in under 53 hours

Quantum Computing has the potential to make the current encryption algorithms obsolete once it gets around to actually being implemented on a large scale. But the Cryptographic experts in charge of such things have been working on Post Quantum Cryptography over the past few years to offset this risk. After three rounds they had narrowed down the public-key encryption and key-establishment algorithms to Classic McEliece, CRYSTALS-KYBER, NTRU, and SABER and te finalists for digital signatures are CRYSTALS-DILITHIUM, FALCON, and Rainbow.

Unfortunately for the Rainbow algorithm, Ward Beullens at IBM Research Zurich in Switzerland managed to find the corresponding secret key for a given Rainbow public key in 53 hours using a standard laptop. This would allow anyone with a laptop to ‘prove’ they were someone else by producing the secret key for a given public key.

The Rainbow signature scheme [8], proposed by Ding and Schmidt in 2005, is one of the oldest and most studied signature schemes in multivariate cryptography. Rainbow is based on the (unbalanced) Oil and Vinegar signature scheme [16, 11], which, for properly chosen parameters, has withstood all cryptanalysis since 1999. In the last decade, there has been a renewed interest in multivariate cryptography, because it is believed to resist attacks from quantum adversaries. The goal of this paper is to improve the cryptanalysis of Rainbow, which is an important objective because Rainbow is currently one of three finalist signature
schemes in the NIST Post-Quantum Cryptography standardization project.

This obviously disqualifies the algorithm from being standardised as it has a known easily exploitable weakness. It goes on to prove that cryptography is not easy and the only way to ‘prove’ the strength of an algorithm is to let others test them for vulnerabilities. Or as Bruce Schneier put it in Schneier’s Law: ‘Anyone can create an algorithm that they themselves can’t break.’ , you need others to validate that claim.

Paper: Breaking Rainbow Takes a Weekend on a Laptop by Ward Beullens (PDF)
Source: New Scientist: Encryption meant to protect against quantum hackers is easily cracked

– Suramya

April 22, 2022

Implications and Impact of Quantum Computing on Existing Cryptography

As all of you are aware the ability to break encryption of sensitive data like financial systems, private correspondence, government systems in a timely fashion is the holy grail of computer espionage. With the current technology it is unfeasible to break the encryption in a reasonable timeframe. If the target is using a 256-bit key an attacker will need to try a max of 2256 possible combinations to brute-force it. This means that even with the fastest supercomputer in the world will take millions of years to try all the combinations (Nohe, 2019). The number of combinations required to crack the encryption key increase exponentially, so a 2048-bit key has 22048 possible combinations and will take correspondingly longer time to crack. However, with the recent advances in Quantum computing the dream of breaking encryption in a timely manner is close to becoming reality in the near future.

Introduction to Quantum Computing

So, what is this Quantum computing and what makes it so special? Quantum computing is an emerging technology field that leverages quantum phenomena to perform computations. It has a great advantage over conventional computing due to the way it stores data and performs computations. In a traditional system information is stored in the form of bits, each of which can be either 0 or 1 at any given time. This makes a ‘bit’ the fundamental using of information in traditional computing. A Quantum computer on the other hand uses a ‘qubit’ as its fundamental unit and unlike the normal bit, a qubit can exist simultaneously as 0 and 1 — a phenomenon called superposition (Freiberger, 2017). This allows a quantum computer to act on all possible states of a qubit simultaneously, enabling it to perform massive operations in parallel using only a single processing unit. In fact, a theoretical projection has postulated that a Quantum Computer could break a 2048-bit RSA encryption in approximately 8 hours (Garisto, 2020).

In 1994 Peter W. Shor of AT&T deduced how to take advantage of entanglement and superposition to find the prime factors of an integer (Shor, 1994). He found that a quantum computer could, in principle, accomplish this task much faster than the best classical calculator ever could. He then proceeded to write an algorithm called Shor’s algorithm that could be used to crack the RSA encryption which prompted computer scientists to begin learning about quantum computing.

Introduction to Current Cryptography

Current security of cryptography relies on certain “hard” problems—calculations which are practically impossible to solve without the correct cryptographic key. Just as it is easy to break a glass jar but difficult to stick it back together there are certain calculations that are easy to perform but difficult to reverse. For example, we can easily multiply two numbers to get the result, however it is very hard to start with the result and work out which two numbers were multiplied to produce it. This becomes even more hard as the numbers get larger and this forms the basis of algorithms like the RSA (Rivest et al., 1978) that would take the best computers available billions of years to solve and all current IT security aspects are built on top of this basic foundation.

There are multiple ways of classifying cryptographic algorithms but in this paper, they will be classified based on the keys required for encryption and decryption. The main types of cryptographic algorithms are symmetric cryptography and asymmetric cryptography.

Symmetric Cryptography

Symmetric cryptography is a type of encryption that uses the same key for both encryption and decryption. This requires the sender and receiver to exchange the encryption key securely before encrypted data can be exchanged. This type of encryption is one of the oldest in the world and was used by Julius Caesar to protect his communications in Roman times (Singh, 2000). Caesar’s cipher, as it is known is a basic substitution cypher where a number is used to offset each alphabet in the message. For example, if the secret key is ‘4’ then each alphabet would be replaced with the 4th letter down from it, i.e. A would be replaced with E, B with F and so on. Once the sender and receiver agree on the encryption key to be used, they can start communicating. The receiver would take each character of the message and then go back 4 letters to arrive at the plain-text message. This is a very simple example, but modern cryptography is built on top of this principle.

Another example is from world war II during which the Germans were encrypting their transmissions using the Enigma device to prevent the Allies from decrypting their messages as they had in the first World War (Rijmenants, 2004). Each day both the receiver and sender would configure the gears and specific settings to a new value as defined by secret keys distributed in advance. This allowed them to transmit information in an encrypted format that was almost impossible for the allied forces to decrypt. Examples of symmetric encryption algorithms include Advanced Encryption Standard (AES), Data Encryption Standard (DES), and International Data Encryption Algorithm (IDEA).

Symmetric encryption algorithms are more efficient than asymmetric algorithms and are typically used for bulk encryption of data.

Asymmetric Cryptography

Unlike symmetric cryptography asymmetric cryptography uses two keys, one for encryption and a second key for decryption (Rouse et al., 2020). Asymmetric cryptography was created to address the problems of key distribution in symmetric encryption and is also known as public key cryptography. Modern public key cryptography was first described in 1976 by Stanford University professor Martin Hellman and graduate student Whitfield Diffie. (Diffie & Hellman, 1976)

Asymmetric encryption works with public and private keys where the public key is used to encrypt the data and the private key is used to decrypt the data (Rouse et al., 2020). Before sharing data, a user would generate a public-private keypair and they would then publish their public key on their website or in key management portals. Now, whoever wants to send private data to them would use their public key to encrypt the data before sending it. Once they receive the cipher-text they would use their private key to decrypt the data. If we want to add another layer of authentication to the communication, the sender would encrypt the data with their private key first and then do a second layer of encryption using the recipient’s public key. The recipient would first decrypt the message using their private key, then decrypt the result using the senders public key. This validates that the message was sent by the sender without being tampered. Public key cryptography algorithms in use today include RSA, Diffie-Hellman and Digital Signature Algorithm (DSA).

Quantum Computing vs Classical Computing

Current state of Quantum Computing

Since the early days of quantum computing we have been told that a functional quantum computer is just around the corner and the existing encryption systems will be broken soon. There has been significant investment in the field of Quantum computers in the past few years, with organizations like Google, IBM, Amazon, Intel and Microsoft dedicating a significant amount of their R&D budget to create a quantum computer. In addition, the European Union has launched a Quantum Technologies Flagship program to fund research on quantum technologies (Quantum Flagship Coordination and Support Action, 2018).

As of September 2020, the largest quantum computer is comprised of 65 qubits and IBM has published a roadmap promising a 1000 qbit quantum computer by 2023 (Cho, 2020). While this is an impressive milestone, we are still far away from a fully functional general use quantum computer. To give an idea of how far we still have to go Shor’s algorithm requires 72k3 quantum gates to be able to factor a k bits long number (Shor, 1994). This means in order to factor a 2048-bit number we would need a 72 * 20483 = 618,475,290,624 qubit computer which is still a long way off in the future.

Challenges in Quantum Computing

There are multiple challenges in creating a quantum computer with a large number of qubits as listed below (Clarke, 2019):

  • Qubit quality or loss of coherence: The qubits being generated currently are useful only on a small scale, after a particular no of operations they start producing invalid results.
  • Error Correction at scale: Since the qubits generate errors at scale, we need algorithms that will compensate for the errors generated. This research is still in the nascent stage and requires significant effort before it will be ready for production use.
  • Qubit Control: We currently do not have the technical capability to control multiple qubits in a nanosecond time scale.
  • Temperature: The current hardware for quantum computers needs to be kept at extremely cold temperatures making commercial deployments difficult.
  • External interference: Quantum computes are extremely sensitive to interference. Research at MIT has found that ionizing radiation from environmental radioactive materials and cosmic rays can and does interfere with the integrity of quantum computers.

Cryptographic algorithms vulnerable to Quantum Computing

Symmetric encryption schemes impacted

According to NIST, most of the current symmetric cryptographic algorithms will be relatively safe against attacks by quantum computer provided a large key is used (Chen et al., 2016). However, this might change as more research is done and quantum computers come closer to reality.

Asymmetric encryption schemes impacted

Unlike symmetric encryption schemes most of the current public key encryption algorithms are highly vulnerable to quantum computers because they are based on the previously mentioned factorization problem and calculation of discrete logarithms and both of these problems can be solved by implementing Shor’s algorithm on a quantum computer with enough qubits. We do not currently have the capability to create a computer with the required number of qubits due to challenges such as loss of qubit coherence due to ionizing radiation (Vepsäläinen et al., 2020), but they are a solvable problem looking at the ongoing advances in the field and the significant effort being put in the field by companies such as IBM and others (Gambetta et al., 2020).

Post Quantum Cryptography

The goal of post-quantum cryptography is to develop cryptographic algorithms that are secure against quantum computers and can be easily integrated into existing protocols and networks.

Quantum proof algorithms

Due to the risk posed by quantum computers, the National Institute of Standards and Technology (NIST) has been examining new approaches to encryption and out of the initial 69 submissions received three years ago, the group has narrowed the field down to 15 finalists and has now begun the third round of public review of the algorithms (Moody et al., 2020) to help decide the core of the first post-quantum cryptography standard. They are expecting to end the round with one or two algorithms for encryption and key establishment, and one or two others for digital signatures (Moody et al., 2020).

Quantum Key Distribution

Quantum Key Distribution (QKD) uses the characteristics of quantum computing to implement a secure communication channel allowing users to exchange a random secret key that can then be used for symmetrical encryption (IDQ, 2020). QKD solves the problem of secure key exchange for symmetrical encryption algorithms and it has the capability to detect the presence of any third party attempting to eavesdrop on the key exchange. If there is an attempt by a third-party to eavesdrop on the exchange, they will create anomalies in the quantum superpositions and quantum entanglement which will alert the parties to the presence of an eavesdropper, at which point the key generation will be aborted (IDQ, 2020). The QKD is used to only produce and distribute an encryption key securely, not to transmit any data. Once the key is exchanged it can be used with any symmetric encryption algorithm to transmit data securely.


Development of a quantum computer may be 100 years off or may be invented in the next decade, but we can be sure that once they are invented, they will change the face of computing forever including the field of cryptography. However, we should not panic as this is not the end of the world as the work on quantum resistant algorithms is going much faster than the work on creating a quantum computer. The world’s top cryptographic experts have been working on Quantum safe encryption for the past three years and we are nearing the completion of the world’s first post-quantum cryptography standard (Moody et al., 2020). Even if the worst happens and it is not possible to create a quantum safe algorithm immediately, we still have the ability to encrypt and decrypt data using one-time pads until a safer alternative or a new technology is developed.


Chen, L., Jordan, S., Liu, Y.-K., Moody, D., Peralta, R., Perlner, R., & Smith-Tone, D. (2016). Report on Post-Quantum Cryptography.

Cho, A. (2020, September 15). IBM promises 1000-qubit quantum computer-a milestone-by 2023. Science.

Clarke, J. (2019, March). An Optimist’s View of the Challenges to Quantum Computing. IEEE Spectrum: Technology, Engineering, and Science News.

Diffie, W., & Hellman, M. (1976). New directions in cryptography. IEEE Transactions on Information Theory, 22(6), 644–654.

Freiberger, M. (2017, October 1). How does quantum computing work?

Gambetta, J., Nazario, Z., & Chow, J. (2020, October 21). Charting the Course for the Future of Quantum Computing. IBM Research Blog.

Garisto, D. (2020, May 4). Quantum computers won’t break encryption just yet.

IDQ. (2020, May 6). Quantum Key Distribution: QKD: Quantum Cryptography. ID Quantique.
Moody, D., Alagic, G., Apon, D. C., Cooper, D. A., Dang, Q. H., Kelsey, J. M., Yi-Kai, L., Miller, C., Peralta, R., Perlner R., Robinson A., Smith-Tone, D., & Alperin-Sheriff, J. (2020). Status report on the second round of the NIST post-quantum cryptography standardization process.

Nohe, P. (2019, May 2). What is 256-bit encryption? How long would it take to crack?
Quantum Flagship Coordination and Support Action (2018, October). Quantum Technologies Flagship.

Rijmenants, D. (2004). The German Enigma Cipher Machine. Enigma Machine.

Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120–126.

Rouse, M., Brush, K., Rosencrance, L., & Cobb, M. (2020, March 20). What is Asymmetric Cryptography and How Does it Work? SearchSecurity.

Shor, P. w. (1994). Algorithms for quantum computation: discrete logarithms and factoring. Proceedings 35th Annual Symposium on Foundations of Computer Science, 124–134.

Singh, S. (2000). The code book: The science of secrecy from Egypt to Quantum Cryptography. Anchor Books.

Vepsäläinen, A. P., Karamlou, A. H., Orrell, J. L., Dogra, A. S., Loer, B., Vasconcelos, F., David, K. K., Melville A. J., Niedzielski B. M., Yoder J. L., Gustavsson, S., Formaggio J. A., VanDevender B. A., & Oliver, W. D. (2020). Impact of ionizing radiation on superconducting qubit coherence. Nature, 584(7822), 551–556.

Note: This was originally written as a paper for one of my classes at EC-Council University in Q4 2020, which is why the tone is a lot more formal than my regular posts.

– Suramya

April 21, 2022

It is possible to plant Undetectable Backdoors in Machine Learning Models

Machine learning (ML) is the big thing and ML algorithms are slowly creeping into all aspects of our life such as unlocking your phone using facial recognition, evaluating the eligibility for a loan, surveillance, what ads you see when surfing the web, what search results you get when searching for stuff etc etc. The problem is that ML algorithms are not infallible they depend on the training data used, confirmational bias etc. At the very least they enforce the existing bias for example, if a company only hires men 25-45 for a role then the ML data set will take this as the input and all future candidates will be evaluated against this criteria because the system thinks that this is what a success looks like. The algorithms themselves are getting more and more complicated and it is almost impossible to review and validate the findings. Due to this decisions are being made by machines that can’t be audited easily. Plus it doesn’t help that most ML models are proprietary and the companies refuse to let outsiders examine them due to Trade secrets and proprietary information used in them.

Another problem is that these ML models is adversarial perturbations where attackers make minor changes to the image/data going in to get a specific response/output. There are a lot of examples of this in the past few years and some of them are listed below (Thanks to Cory Doctorow for consolidating them in one place)

These all take advantage of flaws in the ML model that can be exploited using minor changes in the input data. However, there is another major exploit surface available which is incredibly hard to protect against: Backdoors in the ML models by creating a model that will accept a particular entry/key to produce a specific output. The ‘best’ part is that it is almost impossible to detect if this has been done because the model will function exactly the same as an un-tampered model and will only show the abnormal behavior for the specific key which would have been randomly generated by the creator during the training. If done well then the modifications will be undetectable for most tests.

A team for MIT and IAS has written a paper on it (“Planting Undetectable Backdoors in Machine Learning Models“) where they go into details of how this can be done and the potential impact. Unfortunately, they have not been able to come up with a feasible defense against this attack as of this time. Hopefully that will change as others start focusing on this problem and how to solve it.

Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key”, the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.

First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given black-box access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm or in Random ReLU networks. In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor.

Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, our construction can produce a classifier that is indistinguishable from an “adversarially robust” classifier, but where every input has an adversarial example! In summary, the existence of undetectable backdoors represent a significant theoretical roadblock to certifying adversarial robustness.

The paper is still waiting for the peer-review to complete but the concept and methods they describe seem solid so this is a problem we will have to solve sooner rather than later considering the speed with which ML models are impacting our life.

Source: Schneier on Security: Undetectable Backdoors in Machine-Learning Models

– Suramya

Older Posts »

Powered by WordPress