Suramya's Blog : Welcome to my crazy life…

January 29, 2022

Getting random values from the quantum fluctuations of vacuum using an API

Filed under: Computer Security,Interesting Sites — Suramya @ 10:35 PM

Generating truly random numbers programmatically sounds like it should be simple but is in fact quite hard. Most algorithms that generate random numbers actually produce pseudo-random numbers, which look random but can be predicted if you know the algorithm and its internal state. So the ability to generate/get truly random numbers is a big deal. Cloudflare uses a wall of Lava Lamps to generate random numbers that are used to encrypt the traffic on their servers. Other organizations have other methods, such as measuring atmospheric radiation, sound, etc.

The ANU QRNG website managed by Australian National University offers true random numbers to anyone on the internet. The random numbers are generated in real-time in the lab by measuring the quantum fluctuations of the vacuum.

They have API access enabled for accessing the numbers and users can download blocks of random numbers as well as a .zip file which is updated periodically.

The vacuum is described very differently in the quantum physics and classical physics. In classical physics, a vacuum is considered as a space that is empty of matter or photons. Quantum physics however says that that same space resembles a sea of virtual particles appearing and disappearing all the time. This is because the vacuum still possesses a zero-point energy. Consequently, the electromagnetic field of the vacuum exhibits random fluctuations in phase and amplitude at all frequencies. By carefully measuring these fluctuations, we are able to generate ultra-high bandwidth random numbers.

This website allows everybody to see, listen or download our quantum random numbers, assess in real time the quality of the numbers generated and learn more about the physics behind it. The technical details on how the random numbers are generated can be found in Appl. Phys. Lett. 98, 231103 (2011) and Phys. Rev. Applied 3, 054004 (2015).

I think this is a cool application and a lot of reputable sites/users are using this for their setup so it seems like a reputable source of random numbers. I would still take these numbers and then use that as the seed in a pseudo-random generator and use that result in your application instead of using the number directly.
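As a rough illustration of the "use it as a seed" idea, here is a minimal Python sketch. It assumes you have already downloaded a block of random bytes from the service; the function name and the placeholder bytes are made up for this example and are not part of the ANU API. Note that Python's built-in random module is not meant for cryptographic use, so treat this as a sketch of the pattern rather than something to protect secrets with.

```python
import random


def prng_from_quantum_seed(seed_bytes: bytes) -> random.Random:
    """Return a pseudo-random generator seeded from externally sourced bytes."""
    if len(seed_bytes) < 16:
        raise ValueError("need at least 16 bytes of seed material")
    # Turn the raw bytes into one large integer and use it as the seed.
    return random.Random(int.from_bytes(seed_bytes, "big"))


# Placeholder bytes standing in for a block downloaded from the QRNG service.
example_seed = bytes(range(32))
rng = prng_from_quantum_seed(example_seed)
print([rng.randint(0, 255) for _ in range(5)])
```

The application then draws its values from rng instead of consuming the downloaded numbers directly, so a single downloaded block goes a long way.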

– Suramya

January 28, 2022

IoT Devices and Reducing their Impact on Enterprise Security

Filed under: Computer Security,My Thoughts,Security Tutorials — Suramya @ 11:34 PM

IoT devices are becoming more and more prevalent in the corporate world because they allow us to automate tasks and activities without manual intervention. However, they also increase the risk to the organization by increasing the attack surface available to attackers, since IoT devices can act as entry points to the organization’s internal network. In order to reduce the security impact of these devices, the attack channels and threats from the devices need to be mitigated. This can be done by implementing the suggestions in this paper.

IoT, or the Internet of Things, is a collection of devices that are connected to the internet and can be controlled over a network or provide data over the internet. It is one of the fastest growing markets, with enterprise IoT spending growing by 24% in 2021 from $128.9 billion (IoT Analytics, 2021). This massive growth brings new challenges to the table as administrators need to secure IoT devices in their network to prevent them from being security threats to the network.

IoT devices allow us to manage, monitor and control devices and sensors remotely which in turn allows us to automate tasks and activities without manual intervention. But this capacity comes at an increased risk of vulnerability due to a massive increase of the attack surface available. They are becoming more and more prevalent in an enterprise setting, especially in the office automation and operational technology areas. This increases the risk to the organization by increasing the possibility of threats in areas that traditionally don’t pose cyber security risks.

IoT devices can act as entry points to an organization’s internal network and be used to exfiltrate data from the network without raising flags. In 2018, attackers used a compromised IoT thermometer in the lobby aquarium of a casino to breach its systems and exfiltrate the high-roller database (~10GB of data) out of the corporate network to servers they controlled (Williams-Grut, 2018).

In this paper we will review some of the major threats and attack channels targeting IoT devices and look at how we can reduce the impact of these threats on the enterprise security.

IoT Threats and Attack Channels

IoT devices have multiple attack surfaces due to their design and usage. We will cover the major vulnerabilities in this section along with mitigation steps for each threat and attack channel.

A. Physical Vulnerabilities

Since these devices are usually physically deployed in the field, in addition to the typical software and communication vulnerabilities they are also vulnerable to physical attacks where the device is physically modified to gain access. Some examples of physical attacks are as follows:

  • Attackers physically remove the device memory or flash chips to read & analyze the data and software on the chip.
  • Attackers tamper with the microcontroller to gain access to or identify sensitive information.
  • Attackers physically modify the device to return incorrect data or telemetry. For example, cameras or motion sensors overseeing sensitive locations could be modified to ignore breaches.
  • Attackers use the device connectivity as a bridge to gain access to the corporate network.
  • Attackers authenticate locally to the device using the debug interface to gain access to the device internals.

The best way to protect against such attacks is to ensure the following preventive measures are taken for all devices on the network:

  • Ensure that the device or sensor is not easily accessible physically.
  • All sensors and devices should have tamper-proof seals installed on them, with regular checks to verify that they have not been tampered with.
  • Unused ports, connections, diagnostic connectors, etc. should be physically disabled when possible.
  • If possible, ensure the devices have hardware-based security checks on them.

B. Outdated Firmware

Many of the IoT devices and sensors run older versions of Linux with no easy way to update the firmware, installed software or applications to the latest versions. This creates a major security risk as the device is running software with known security vulnerabilities which allows attackers to easily compromise a device.

There is no easy way to resolve this problem and protect the devices as a lot of these sensors and devices are not designed with security in mind. The best way to approach this problem is to ensure you are working with reputable device manufacturers who will ensure that appropriate support and updates are going to be available for the device/sensor.

The organization should review the recommendations by the IoT working group of the Cloud Security Alliance on how to perform IoT firmware updates securely and regularly (Khemissa et al., 2018). It should also include the IoT sensors and devices in the organization’s update cycles, which will ensure that patches and updates are installed on them in a timely manner.

Another option is to explore installing open source firmware and software on the IoT device/sensor if this option is available. Open source firmware is usually updated more frequently and can be customized to better secure the device.

C. Hard Coded Passwords/Accounts

Some IoT devices have hard coded account passwords that cannot be changed, and this gives an attacker backdoor access to the device that is difficult to protect against. Hardcoded passwords are particularly dangerous because they are easy targets for password guessing exploits, allowing attackers to hijack firmware, devices, systems, software, etc. A famous case of such an exploit came in 2017, when researchers found default hardcoded passwords in IoT cameras manufactured by Foscam that gave admin access to anyone who used them (Heller, 2017). These passwords allow an attacker to gain access to the device and use it as a launch point for attacks on the network.

Another famous attack exploiting this was by the Mirai malware in 2016. It scanned for and exploited Linux-based IoT boxes running BusyBox (such as DVRs and WebIP Cameras) using hardcoded usernames and passwords. Once it gained access, these devices were enrolled in a botnet containing over 400,000 connected devices, which was then used to perform DDoS attacks on major companies across the world (Fruhlinger, 2018).

To protect against these attacks, we should ensure the default passwords on all devices are changed before deployment and that passwords are rotated regularly. An active pentest against the device should be conducted to uncover any hidden or hardcoded accounts. If any are found, the manufacturer should be contacted to provide an update to disable these accounts.
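A minimal Python sketch of the default-password check is shown below. The credential list, the DeviceRecord type and the sample inventory are all hypothetical and only illustrate the idea; a real audit would use vendor documentation or a maintained default-credential list and would not keep passwords in plain text like this.

```python
from dataclasses import dataclass

# Hypothetical sample of factory-default credential pairs.
KNOWN_DEFAULTS = {("admin", "admin"), ("root", "root"), ("admin", "1234")}


@dataclass
class DeviceRecord:
    name: str
    username: str
    password: str


def devices_with_default_credentials(inventory):
    """Return the names of devices still using a known default login."""
    return [
        device.name
        for device in inventory
        if (device.username, device.password) in KNOWN_DEFAULTS
    ]


# Made-up inventory used only to show the call.
inventory = [
    DeviceRecord("lobby-camera", "admin", "admin"),
    DeviceRecord("hvac-sensor", "svc-hvac", "long-unique-passphrase"),
]
print(devices_with_default_credentials(inventory))
```

Any device this flags should have its password changed, or be escalated to the manufacturer if the account turns out to be hardcoded.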

D. Poor IoT device management

A study published in July 2020 found that almost 15% of IoT devices on an enterprise network were unknown or unauthorized and between 5% and 19% of these devices were using unsupported legacy operating systems (Help Net Security, 2020). These devices make up what is known as a Shadow IoT network, which is implemented without the knowledge of the organization’s IT team and can be a major weak point in the organization’s security perimeter.

The best way to protect against this scenario is to ensure regular scans are done on the network to identify any unknown or new devices connected to it. These scans will enable us to identify unauthorized devices, which can then be incorporated into the official network and update cycle or disconnected, depending on the requirements. Another way to find these unauthorized devices is to monitor and analyze network connections and traffic. New devices will change the network data flow, and this can be used to identify or locate new devices or sensors connected to the network.
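The scan-and-compare step described above can be sketched in a few lines of Python. This assumes the scan results and the approved inventory are both available as lists of MAC addresses; the function name and the sample addresses are invented for illustration.

```python
def find_unknown_devices(discovered, authorized):
    """Return MAC addresses seen on the network but missing from the inventory."""

    def normalize(mac):
        # Treat "AA-BB-..." and "aa:bb:..." as the same address.
        return mac.strip().lower().replace("-", ":")

    known = {normalize(mac) for mac in authorized}
    seen = {normalize(mac) for mac in discovered}
    return sorted(seen - known)


# Made-up sample data: one device from the scan is not in the inventory.
scan_results = ["AA-BB-CC-00-00-01", "aa:bb:cc:00:00:02", "aa:bb:cc:00:00:99"]
inventory = ["aa:bb:cc:00:00:01", "AA:BB:CC:00:00:02"]
print(find_unknown_devices(scan_results, inventory))
```

Anything returned here is a candidate Shadow IoT device that needs to be either onboarded or disconnected.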

E. Man-in-the-Middle Attacks

Communication channels in IoT devices are usually only weakly protected, and an attacker can compromise the channel to intercept and modify the messages between devices. This allows the attacker to cause malfunctions or show incorrect data. This can potentially cause serious harm if the targeted IoT devices are connected to or managing industrial or medical equipment. It can also allow attackers to hide their tracks and physical evidence of their work.

F. Industrial Espionage & Eavesdropping

IoT devices such as cameras, microphones, etc. are used to remotely monitor sensitive areas or devices for problems. If an attacker compromises these devices, they can watch and listen in on their target, compromising the target’s privacy and potentially gaining access to sensitive data or video. For example, IoT cameras deployed in bedrooms have been used to record and leak intimate videos of the residents without their knowledge. Compromised security cameras have been used to record ATM pins entered by unsuspecting users.

Other steps that should be taken to reduce risk from IoT devices on your network:

  • Segregate your Networks: IoT devices should be on a separate segment of the network which is isolated from the production and user network with a firewall sitting between the two. This will allow you to block access to the production network from the IoT network which will prevent an attacker from gaining full access to the enterprise network in case they breach the IoT network.
  • Enable HTTPS/Encrypted connectivity for IoT devices: All connections to and from the IoT devices should be encrypted to protect against Man-in-the-middle attacks.
  • Deploy an IDS: Deploying an Intrusion Detection System (IDS) on the network can alert us to attack attempts. All alerts from the IDS should be investigated and verified.
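To make the network segregation point a little more concrete, here is a small Python sketch that flags traffic flows going from the IoT segment into the production segment. The two address ranges and the function name are hypothetical; in practice this rule would live in the firewall itself, and a check like this would only be used to review flow logs.

```python
import ipaddress

# Hypothetical address ranges for the two segments.
IOT_NET = ipaddress.ip_network("10.20.0.0/24")
PROD_NET = ipaddress.ip_network("10.10.0.0/16")


def violates_segmentation(src: str, dst: str) -> bool:
    """True if a flow starts in the IoT segment and targets production."""
    return (
        ipaddress.ip_address(src) in IOT_NET
        and ipaddress.ip_address(dst) in PROD_NET
    )


print(violates_segmentation("10.20.0.5", "10.10.3.7"))
print(violates_segmentation("10.20.0.5", "10.20.0.9"))
```

The first flow crosses from the IoT segment into production and is flagged, while the second stays inside the IoT segment and is not.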

These are just some of the attack surfaces available to attackers targeting IoT devices. In fact, with the increase in computing power available to these devices they are almost mini computers, and most of the attacks that impact traditional systems such as servers or desktops can target IoT devices as well with minimal modifications. So, it is essential that security training is conducted for all employees in the organization to make them aware of the risks posed by IoT devices, and that the security team is trained in methods to secure these devices from attackers.

Note: This was originally written as a paper for one of my classes at EC-Council University in Q3 2021, which is why the tone is a lot more formal than my regular posts.

– Suramya

January 27, 2022

New MoonBounce UEFI Bootkit that can’t be removed by replacing the Hard Disk

Filed under: Computer Security,Computer Software,My Thoughts — Suramya @ 1:05 AM

Viruses and malware have evolved a lot in the past 2-2.5 decades. I remember the first virus that infected my computer back in 1998, it corrupted the boot sector and the partition table to the point where I couldn’t even format the drive as it wasn’t detected by the OS. I tried booting via a floppy and running scandisk on it (this is on DOS 6.1/Windows 3.1) but it wouldn’t detect the disk, same issue with Norton Disk Doctor (NDD). Was scared to tell the parents that I had broken the new computer but after a whole night of trying various things based on conversations with friends, suggestions in books etc I managed to get NDD to detect the disk and repair the partition table. After that it was a relatively simple task to format the disk and reinstall DOS. Similarly all the other viruses I encountered could be erased by formatting the disk or replacing it.

There were a few that tried using the BIOS for storing info but not many. I did create a prank program that would throw insults at you when you typed the wrong command every 5th boot. The counter for the boot was kept in the BIOS. But this didn’t have any propagation logic in the code and had to be manually run on each machine, plus it had to be customized manually for every new BIOS type/version so wasn’t something that could spread on its own.

With the new malware/viruses that have come out in the past few decades we are seeing more advanced capabilities of propagation and persistence, but till now you could still replace the drive infected with a virus and start with a clean slate. However, that has now changed with the new MoonBounce UEFI Bootkit, which can’t be removed by replacing the Hard Drive as it stores itself in the SPI flash memory that is found on the motherboard. This means that the bootkit will remain on the device till the SPI memory is re-flashed or the whole motherboard is replaced, which makes it very difficult and expensive to recover from the infection.

Securelist has a very detailed breakdown of the Bootkit which you should check out. The scary part is that this is not the only UEFI bootkit out there. There are a few others, such as ESPecter and FinSpy’s UEFI bootkit, that prove that the capability is becoming more mainstream and that we should expect to see more such bootkits in the near future.

Source: Slashdot: New MoonBounce UEFI Bootkit Can’t Be Removed by Replacing the Hard Drive

– Suramya

January 26, 2022

Got a new Biometric lock installed

Filed under: My Life — Suramya @ 3:39 AM

Yesterday I finally replaced my old Biometric lock that I have been using for the past 8 years with a newer model. The old one was still working fine for the most part but gave me a fright a few weeks ago when its batteries died (I think that is what happened) and I couldn’t unlock the door. We did have a manual override key for the lock but I guess I don’t know my own strength because I broke off the key (in the lock) when I tried to unlock using it. It looked like I would have to break the lock to get in but thankfully I remembered at the last minute that the lock had the option of providing power externally and was able to unlock it using a 9 volt battery. Due to this and other small issues that were cropping up in the lock we decided to replace it with a newer version.

Searching online I found a lot of locks available but decided against most of them because I didn’t want the lock to be internet connected. There are enough security issues with the apps and I don’t like the idea of random folks being able to connect to my lock remotely for fun and profit. Finally narrowed it down to two options: the first was a Godrej model and the other was the one we got. The Godrej one looked good but as per their support team required a door with a min thickness of 42mm and our door is only 35mm. We could have gotten extra plywood put in to thicken the door but since the other option was 10k cheaper, had more functionality and didn’t require modification we decided to go with that one instead.

Ordered the lock online and it was delivered in ~3 days, installation took a while because they took a while to assign a technician for some reason but after yelling at them for a bit (and offering to return the lock) it was finally installed yesterday. The installation person was pretty good and the whole thing took about 40 mins to complete.

Now with the new lock I can unlock the door with fingerprints, a pin, an RFID card or the manual override key. In case the power goes off it has the option of using a powerbank as external power so that is a relief. Plus it doesn’t require dismantling the handle to get access to the override key so that is a big advantage.

The new lock’s sensor is a lot more sensitive and processes faster than my old one. Thinking about what to do with the old one: one option is to send it to my parents’ place in Delhi, another is to use it for secure storage here in Bangalore itself, but that would require work and I honestly don’t have that many valuables that would require a biometric storage locker. In any case for now it is going into storage.

Well this is all for now, will post more later.

– Suramya

PS: I didn’t specify the lock model / make in the post specifically because I don’t think I want to make that public. But if you are interested in discussing more or are planning to buy you can reach out offline and we can talk in more detail.

January 25, 2022

Intentionally breaking popular opensource projects for… something

Filed under: Computer Software,My Thoughts — Suramya @ 10:23 AM

Recently Marak Squires, the developer of the extremely popular npm modules Colors & Faker, decided to intentionally commit changes into the code that broke the modules and brought down thousands of apps worldwide. Initially it was thought that the modules were hacked as others have been in the past, but looking at the commit history it was obvious that the changes were committed by the developer themselves. Which brings us to the question of why on earth would someone do something like this? Marak didn’t explicitly state why the changes were made, but considering their past comments it does seem like this was done intentionally:

In November 2020, Marak had warned that he will no longer be supporting the big corporations with his “free work” and that commercial entities should consider either forking the projects or compensating the dev with a yearly “six figure” salary.

“Respectfully, I am no longer going to support Fortune 500s ( and other smaller sized companies ) with my free work. There isn’t much else to say,” the developer previously wrote.

“Take this as an opportunity to send me a six figure yearly contract or fork the project and have someone else work on it.”

The aftermath of the changes is that NPM has revoked the developer’s rights to commit code, their GitHub account has been suspended and the modules in question have been forked. Now Marak is pleading for his accounts to be reinstated, claiming the issue was caused by a ‘programming mistake’, which seems like a far-fetched excuse, especially given how they made fun of the problem right after people reported it. That doesn’t seem like the reaction we would see if this was a legitimate mistake.

My guess is that they thought this would play out differently, with companies falling over themselves to give them money/contracts etc, but didn’t anticipate how it would blow back on them. I mean if I was hiring right now and their resume came up I would think twice about hiring them because of this stunt. They have shown that they can’t be trusted and what is to stop them from making changes to my company’s software and bringing it to a screeching halt because they felt that they were not being paid their dues? I mean they have already done it once, what is to stop them from doing it again? This looks like a textbook example of what not to do in order to get people to work with you/hire you.

One of the things that I have heard from detractors of OpenSource software when I was pushing for it in my previous companies is the question of how we can be sure the software will be there a year from now and who we blame if the software is broken and we need help. Stunts like this don’t help improve the image of Open Source software and this person is now reaping their just deserts.

The positive side is that because the code is opensource, it has already been forked and others have taken over the codebase to ensure we don’t hit similar issues going forward.

– Suramya

January 24, 2022

Citibank Bangalore – Doing a great job enforcing Covid Appropriate Behavior at their branch

Filed under: My Thoughts — Suramya @ 10:17 PM

Had to go to Citibank branch on MG Road, Bangalore today for some work and was extremely impressed by how well they are enforcing Covid Appropriate Behavior (CAB) at the branch. As I walked over to the entrance the security guard took my temperature and then asked me to wait outside as all the counters had people at them. As other folks came up to the gate the guy asked them to wait in queue and ensured all were maintaining 6 ft distance. A few people grumbled a little on being asked to wait outside but no one created a problem.

Once the counter freed up, I was allowed inside and proceeded with my work. Even while waiting inside, they ensured folks are not sitting close to each other and every single person inside was wearing a mask. It was nice to see such a well managed setup here. We can only control Covid spread by ensuring we all get vaccinated, wear a mask and follow CAB at all times.

– Suramya

January 23, 2022

Some thoughts on Crypto currencies and why it is better to hold off on investing in them

Filed under: Computer Related,My Thoughts,Techie Stuff — Suramya @ 1:26 AM

It seems that every other day (or every other hour if you are unlucky) someone or the other is trying to get people to use Crypto currency because they claim that it is awesome and not at all dependent on government regulations and thus won’t fluctuate that much. Famous people are pushing it, others like New York City Mayor Eric Adams are trying to raise awareness of it (he decided to convert his first paycheck to Crypto), El Salvador started accepting crypto currency as legal tender, etc. However, the promises made by crypto enthusiasts don’t translate into reality as the market remains extremely volatile.

I see people posting on twitter that Crypto currencies are better because they are stable, but in my opinion if a currency can drop 20% because Elon Musk tweeted a Broken heart emoji then it is not something I want to use to store my savings. Earlier this week the entire Bitcoin market dropped over 47% from its high back in Nov 2021. Mayor Adams’ paycheck which was converted to crypto is now worth ~1/2 of what it was when he invested it, and that is a massive drop. Imagine losing 50% of your savings in one shot. You might suddenly have no way to pay for rent, emergency repairs, hospitalization, etc. Even El Salvador has seen its credit become 4 times worse than it was before it moved to Bitcoin. People there are complaining that the promised reduction in cost for conversion to/from international currencies is a myth as they are paying more in transaction costs than they were paying earlier.

Another major issue with crypto currency is the ecological hit caused by the mining. According to research done by the University of Cambridge, globally Bitcoin uses more power per year than the entire population of Argentina. The recent Kazakhstan unrest and protests were sparked off by surging fuel prices that were caused by the migration of Bitcoin miners to the country after China banned them. This caused a lot of strain on the electricity grid and required an increase in the prices, which kicked off a massive protest that has caused an untold number of deaths. There are multiple folks coming up with new crypto-currencies that claim to be carbon neutral but so far none of them have delivered on the promise.

Bitcoin is thought to consume 707 kWh per transaction. In addition, the computers consume additional energy because they generate heat and need to be kept cool. And while it’s impossible to know exactly how much electricity Bitcoin uses because different computers and cooling systems have varying levels of energy efficiency, a University of Cambridge analysis estimated that bitcoin mining consumes 121.36 terawatt hours a year. This is more than all of Argentina consumes, or more than the consumption of Google, Apple, Facebook and Microsoft combined.

Check out this fantastic (though very long – 2hr+) video on economic critique of NFTs, DAOs, crypto currency and web3. (H/t to Cory Doctorow)

In summary, I would recommend against investing in crypto currencies till the issues highlighted above are resolved (if they are ever resolved).

– Suramya

January 22, 2022

Malware can now Intercept and fake an iPhone reboot

Filed under: Computer Security,Computer Software,My Thoughts — Suramya @ 1:50 AM

Rebooting the system has always been a good way to clean start your system (phone or computer). Some phone malware doesn’t have the ability to persist, so it can be removed just by rebooting the phone (especially on the iPhone). Now, researchers from the ZecOps Research Team have figured out how to fake a reboot on an iPhone, which allows malware/surveillance software to spoof the shutdown / reboot of a phone. As you can imagine, this has massive security impact. The first problem is that we can’t be sure that the phone has been rebooted, so we can’t be sure the malware has been removed. Secondly, some folks shut down their phones while discussing sensitive information. Using this technique the attackers can pretend that the phone is switched off, while it is still on, and eavesdrop using the phone’s camera and mic.

We’ll dissect the iOS system and show how it’s possible to alter a shutdown event, tricking a user that got infected into thinking that the phone has been powered off, but in fact, it’s still running. The “NoReboot” approach simulates a real shutdown. The user cannot feel a difference between a real shutdown and a “fake shutdown.” There is no user-interface or any button feedback until the user turns the phone back “on.”

The problem is exacerbated due to there not being any physical method of powering the device off. Earlier phone models had removable batteries which allowed a user to physically remove the battery when they wanted to secure the device. Now the battery is built in and there is no way to remove it without dismantling the device and voiding your warranty in the process. I have discussed this with various folks over the years that it is impossible to ensure a device is powered off when we shut it down because we can’t remove the battery.

A silver lining around this is that it looks like hard reboots are harder to spoof so if you want to be sure that your phone is actually off, you can shut it down using a hard-reboot. Another solution is to carry a Faraday bag with you and put your phone inside when you need to be off-grid.

Source: Schneier’s Blog: Faking an iPhone Reboot

– Suramya

January 21, 2022

nerd-dictation: A fantastic Open Source speech to text software for Linux

After a long time of searching I finally found speech to text software for Linux that actually works well enough that I can use it for dictating without having to jump through too many hoops to configure and use. The software is called nerd-dictation and is open source. It is fairly easy to set up compared to the other voice-to-text systems that are available but still not at a stage where a non-tech savvy person would be able to install it easily. (There is effort ongoing to fix that.)

The steps to install are fairly simple and documented below for reference:

  • pip3 install vosk
  • git clone
  • cd nerd-dictation
  • wget
  • unzip
  • mv vosk-model-small-en-us-0.15 model

nerd-dictation allows you to dictate text into any software or editor which is open, so I can dictate into a word document or a blog post or even the command prompt. Previously I tried other software which actually works quite well but doesn’t allow you to edit the text as you’re typing, so you basically dictate the whole thing and the system gives you the transcription after you are done. So, you have to go back and edit/correct the transcript, which can be a pain for long dictations. This software works more like Microsoft Dictate which is built into Word. Unfortunately my Word install on Linux using Crossover doesn’t allow me to use the built in dictate function and I have no desire to boot into Windows just so that I can dictate a document.

The install steps above download the software into the current directory. I set it up in /usr/local but it is up to you where you want it. In addition, I would recommend that you install one of the larger dictionaries/models, which makes the voice recognition a lot more accurate. However, do keep in mind that the larger models use up a lot more memory so you need to ensure that your computer has enough memory to support them. The smaller ones can run on systems as small as a Raspberry Pi, so you can choose depending on your system configuration. The models are available here.

The software does have some quirks, like when you are talking and you pause it will take it as the start of a new sentence and for some reason it doesn’t put a space after the last word. So unless you’re careful you need to go back and add spaces to all the sentences that you have dictated, which can get annoying. (I started manually pressing space every time I paused to add the space.) Another issue is that it doesn’t automatically capitalize words when you dictate, such as those at the beginning of a sentence or the word ‘I’. This requires you to go back and edit, but that being said it still works a lot better than the other software that I have used so far on Linux. For Windows systems Dragon Voice Dictation works quite well but is expensive. I tested it out by dictating this post with it and for the most part it worked quite well.
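The spacing and capitalization quirks are the kind of thing a small text clean-up function can paper over. Below is a minimal Python sketch of one; the function name tidy_dictation is made up for this example, and how (or whether) you hook it into nerd-dictation depends on the post-processing options described in the project documentation, so treat the wiring as an assumption.

```python
import re


def tidy_dictation(text: str) -> str:
    """Capitalize 'I' and sentence starts, and end the chunk with a space."""
    # Capitalize the standalone word "i" (also covers "i'm", "i'll", ...).
    text = re.sub(r"\bi\b", "I", text)
    # Capitalize the first letter of the text and of each new sentence.
    text = re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda match: match.group(1) + match.group(2).upper(),
        text,
    )
    # Make sure the next dictated chunk does not run into this one.
    if text and not text.endswith(" "):
        text += " "
    return text


print(repr(tidy_dictation("hello there. i think i am done")))
# prints 'Hello there. I think I am done '
```

This only handles the simple cases mentioned above, but it removes most of the manual fixing for normal prose.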

Running the software again requires you to run commands on the commandline, but I configured shortcut keys to start and stop the dictation which makes it very convenient to use. Instructions on how to configure custom shortcut keys are available here. If you don’t want to do that, then you can start the transcription by issuing the following command (assuming the software is installed in /usr/local/nerd-dictation):

/usr/local/nerd-dictation/nerd-dictation begin --vosk-model-dir=/usr/local/nerd-dictation/model  --continuous

This starts the software and tells it that we are going to dictate for a long time. More details on the options available are available on the project site. To stop the software you should run the following command:

/usr/local/nerd-dictation/nerd-dictation end

I suggest you try this if you are looking for speech-to-text software for Linux. Well, this is all for now. Will post more later.

Thanks to Hacker News: Nerd-dictation, hackable speech to text on Linux for the link.

– Suramya

January 20, 2022

Impact of Google Hacking and Data Collection using Search Engines on CyberSecurity

Filed under: Computer Security,My Thoughts — Suramya @ 1:58 AM

Modern search engines scan most public sites on a regular basis and, unlike legacy search engines, can also find and index data or files that are not linked to from any other source. This allows the search engine to index data/files that could contain sensitive data or details on vulnerabilities. Using publicly available information, attackers can search for such information without touching the target system directly, leaving little trace that could alert the defenders. Most organizations are not aware of the information being leaked by such means or how it compromises their cyber security. The availability of the Google Hacking Database allows even minimally skilled attackers to search for information quickly and efficiently. This poses a high risk to organizations leaking sensitive data. There are no sure-shot solutions to this problem, and even the most careful organizations will expose data that, when combined with other sources, gives attackers a look at the organization’s digital assets and systems.

The popular image of a hacker involves an attacker sitting in a dark room typing commands into a terminal to gain access, with the whole attack completed in a very short period of time. In real life, attackers spend a lot of time performing reconnaissance on the target before even engaging with the target system. One of the popular ways of performing reconnaissance is to use search engines like Google to find data. This technique is called Google Hacking and was introduced to the public in 2004 by Johnny Long. He defined it as “the art of creating complex search engine queries in order to filter through large amounts of search results for information related to computer security” (Johnny, 2004). Attackers use Google Hacking to uncover sensitive information about a company or to uncover potential security vulnerabilities.

The Google Hacking Database (GHDB) is a consolidated database of queries collected over the years thanks to contributions by researchers, hackers and the general public. These queries can be used to find sensitive data on websites, such as files containing passwords, configurations, financial information, error messages, firewall logs and other such data (Google Hacking Database, 2021). The database is in an easy-to-consume format and allows users to search for queries that will return specific types of data.

This database gives attackers the queries to use to find specific types of data, leveraging the indexing power of Google to find information that should not have been exposed to the public.

How Google Hacking Works

Google allows a user to search for information using search keywords and a combination of search operators to limit the search results. With the information available in the Google Hacking Database an attacker can search for specific information and limit the search to a given target domain. There are multiple kinds of queries available that target specific kinds of information. Some of the categories of information available using this are:

  • Advisories and vulnerabilities: Locate vulnerable servers based on product or version-specific setups with known vulnerabilities.
  • Sensitive directories: Find directories with files that contain sensitive information.
  • Files containing passwords: Locate files containing passwords.
  • Pages containing login portals: Locate login pages for various services.
  • Error messages: Find files with error messages that may contain details about the system.

Below are examples of the various queries that are available and the kind of data they expose.

Searching for passwords stored in files

Users sometimes store passwords in plain text files or Excel spreadsheets that are accidentally uploaded to a public site. These are then indexed by Google (or other search engines) and can be found using specific queries. For example:

allintext:"*" OR "password" OR "username" filetype:xlsx

searches for Excel files whose text contains any of the search terms provided, such as “password” or “username”. If required, we can limit the search to a specific site using the “site:” search parameter.
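
When checking our own exposure, restricting a query to one domain comes down to adding the site: operator. The snippet below is a small Python sketch of that step. The helper name restrict_to_site is made up for illustration, and example.com stands in for a domain you own or are authorized to test.

```python
def restrict_to_site(query: str, domain: str) -> str:
    """Limit a search query to one domain with the site: operator.

    Hypothetical helper: it only builds the query string and does not
    contact any search engine.
    """
    domain = domain.strip().lower()
    # Drop a scheme and path if a full URL was passed instead of a domain.
    if "://" in domain:
        domain = domain.split("://", 1)[1]
    domain = domain.split("/", 1)[0]
    if not domain:
        raise ValueError("domain must not be empty")
    return f"site:{domain} {query.strip()}"


if __name__ == "__main__":
    # Queries taken from this post, restricted to a placeholder domain.
    for q in ('allintext:"*" OR "password" OR "username" filetype:xlsx',
              "allintext:username filetype:log"):
        print(restrict_to_site(q, "example.com"))
```

The resulting strings can be pasted into the search box by hand, which keeps the check manual and low volume.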

Searching for log files

Log files can reveal a lot of sensitive information if exposed to the public. Error logs and access logs can expose information such as the PHP version you are running, CMS version details, operating system details, etc. If firewall logs or system logs are exposed, they can reveal information such as usernames, firewall version and configuration details. Similarly, SQL logs can expose sensitive data as well. This information, combined with other information, can give an attacker a foothold in the system. For example:

allintext:username filetype:log

This query returns results that include the text “username” inside *.log files. The following query returns directory listings where log files are publicly accessible:

intitle:"index of" errors.log

SSH private keys

SSH private keys are used to authenticate to servers during SSH connections, which lets users log in without passwords. If a key is exposed, anyone can impersonate that user, and if passwordless logins are enabled, the key will allow the attacker to log in to the server without a password. The following query returns directory listings that contain a publicly accessible private key:

intitle:index.of id_rsa
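
The same exposure can be checked from the defender’s side by looking for key material under the web root. The following is a rough Python sketch, assuming keys are stored in the usual PEM/OpenSSH text format that starts with a “-----BEGIN … PRIVATE KEY-----” line. The function names and the idea of passing in a web root path are illustrative choices, not part of any particular tool.

```python
import re
from pathlib import Path

# PEM and OpenSSH private keys begin with a header such as
# "-----BEGIN RSA PRIVATE KEY-----" or "-----BEGIN OPENSSH PRIVATE KEY-----".
PRIVATE_KEY_HEADER = re.compile(rb"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def looks_like_private_key(data: bytes) -> bool:
    """Return True if the first few KB of data contain a private key header."""
    return PRIVATE_KEY_HEADER.search(data[:4096]) is not None


def find_private_keys(web_root: str) -> list[Path]:
    """Walk web_root and list files whose content looks like a private key."""
    hits = []
    for path in Path(web_root).rglob("*"):
        if not path.is_file():
            continue
        try:
            with path.open("rb") as fh:
                if looks_like_private_key(fh.read(4096)):
                    hits.append(path)
        except OSError:
            # Unreadable files are skipped rather than aborting the scan.
            continue
    return sorted(hits)
```

Checking the content rather than the file name means a key that was renamed before being uploaded is still caught.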

Login Portals

Organizations often expose their development or staging systems to the internet for testing and depend on the obscurity of the system for protection. These systems are vulnerable because development systems often don’t have the same protections and controls applied to them as production systems do. In addition, there are often systems that were not meant to be public, such as router login pages and CMS admin sections, that increase the attack surface of the organization. A sample query to find login pages for the Cisco email security appliance is listed below:

intitle:"Cisco Email Security Virtual Appliance" inurl:csrfkey=

SQL dumps

Sometimes sites require SQL data dumps to be made for backup or restoration purposes, and these dumps often have a lot of sensitive data in them. Using a search query similar to the one listed below, attackers can find these dumps and explore the data:

ext:sql | ext:txt intext:"-- phpMyAdmin SQL Dump --" + intext:"admin"

There are many more queries available in the database to search for specific data, and more are added every day.

Famous attacks that used Google Hacking/Google Dorks

Attacks using Google Hacking/Google Dorks are difficult to identify due to the passive nature of the attacks. However, even with that restriction, there have been a few cases of note where attackers used this technique to attack an organization’s system. One of them is described below.

N.Y. Dam attack from Iran, 2013

Between 2011 and 2013, Hamid Firoozi from Iran gained access to the Bowman Avenue Dam in Rye Brook, New York by using Google searches to find an unprotected computer that controlled the dam’s sluice gates (Matthews, 2016). The issue is rampant enough that the Department of Homeland Security and the FBI jointly released a warning about Google dorking: “By searching for specific file types and keywords, malicious cyber actors can locate information such as usernames and passwords, e-mail lists, sensitive documents, bank account details, and website vulnerabilities” (FBI, 2014).

Detection of Google Hacking Attacks

Detection of these attacks is difficult due to the passive nature of the attack. However, one technique that is quite successful is the honeypot approach. Organizations can store files with fake information that looks authentic and important, such as username and password combinations or SSH private keys that belong to non-existent accounts. Because these accounts do not exist, no one should be attempting to log in to them for legitimate purposes. So when a login attempt is made to these accounts, or when the files are accessed, we know that a Google Hacking attack is in progress and the IP address etc. can be flagged for follow-up or blocking. We can also lure the attacker into a fake network that is monitored to identify what information they are looking for.

Using that information, we can take further preventive measures to protect the system.
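
Flagging access to the decoy files can be sketched as a scan of the web server’s access log. The Python below is a minimal example under a few assumptions: the log uses the Common/Combined Log Format, and the decoy paths in DECOY_PATHS are hypothetical placeholders for wherever the fake files were planted.

```python
import re
from collections import defaultdict

# Hypothetical decoy files; they contain fake credentials only.
DECOY_PATHS = {"/backup/passwords.xlsx", "/old/id_rsa"}

# Start of a Common/Combined Log Format line:
# client IP, identity, user, [timestamp], "METHOD target protocol"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:[A-Z]+) (\S+)')


def find_decoy_hits(log_lines, decoys=DECOY_PATHS):
    """Map each client IP to the sorted list of decoy paths it requested."""
    hits = defaultdict(set)
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, target = match.groups()
        path = target.split("?", 1)[0]  # ignore any query string
        if path in decoys:
            hits[ip].add(path)
    return {ip: sorted(paths) for ip, paths in hits.items()}
```

Any IP address returned by find_decoy_hits requested a file that no legitimate user has a reason to open, so it is a candidate for follow-up or blocking.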

Prevention Techniques for Google Hacking attacks

There are a few steps that we can take to avoid leaking sensitive data to attackers using Google Dorks as listed below:

  • Protect sensitive and private data with authentication.
  • Don’t expose development systems to the internet. If that is not possible, restrict access using IP-based restrictions.
  • Run regular vulnerability scans on your website/domain. A lot of scanners now incorporate checks for popular Google Dork queries.
  • Run manual dork queries against your site to locate leaks before attackers do.
  • Add checks to your servers to find sensitive files in public directories, such as any file with an extension other than php/asp/html. These can be potential leaks.
  • If you find sensitive content exposed, you can request its removal by using the Google Search Console.
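
The server-side check for unexpected files in public directories can be sketched in a few lines of Python. This is an illustration only: the allowlist below mirrors the php/asp/html example from the list above, the function names are made up, and a real site would need to extend the allowlist with its static assets (images, stylesheets, scripts) before the output becomes useful.

```python
from pathlib import Path

# Extensions expected in the public web root (assumption: adjust per site).
ALLOWED_EXTENSIONS = {".php", ".asp", ".html"}


def is_unexpected(filename: str, allowed=ALLOWED_EXTENSIONS) -> bool:
    """Return True if the file's extension is not on the allowlist."""
    return Path(filename).suffix.lower() not in allowed


def find_unexpected_files(web_root: str, allowed=ALLOWED_EXTENSIONS) -> list[Path]:
    """List files under web_root whose extension is not on the allowlist."""
    root = Path(web_root)
    flagged = [
        path.relative_to(root)
        for path in root.rglob("*")
        if path.is_file() and is_unexpected(path.name, allowed)
    ]
    return sorted(flagged)
```

Files without any extension, such as id_rsa, are flagged as well, since an empty extension is not on the allowlist.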


Google Hacking allows an attacker to perform reconnaissance against an organization in a passive way, collecting information that can then be combined with other sources to gain a foothold. Preventing such information leaks is a good way to protect the organization’s systems, and the techniques listed above can help with that. We can also subscribe to services that perform these checks on our behalf.

We covered some of the techniques available to detect and prevent Google Hacking attacks in this paper. While the techniques discussed will not protect against all attacks, they will reduce the attack surface and protect you against most attackers.

Note: This was originally written as a paper for one of my classes at EC-Council University in Q2 2021, which is why the tone is a lot more formal than my regular posts.
