Suramya's Blog : Welcome to my crazy life…

October 9, 2023

Microsoft AI responds with absolute nonsense when asked about a prominent Cyber Security expert

Filed under: Artificial Intelligence,Computer Software — Suramya @ 11:39 PM

The more I read about the Microsoft implementation of ‘AI’ the more I wonder what on earth are they thinking? Their AI system is an absolute shambles and about 99% of the output is nonsense. See the example below:

I did not realise how inaccurate Microsoft's Al is. It's really bad
Microsoft AI returns absolute nonsense when asked about who Kevin Beaumont is

I did not realise how inaccurate Microsoft’s Al is. It’s really bad This is just one example – it lists a range of lawsuits I’ve filed, but they’re all fictional – it invented them and made up the citations. It says I gave Microsoft’s data to @briankrebs. It says Krebs is suing me. It says @malwaretech works for me. The list goes on and on. Very eyebrow raising this is being baked into next release of Windows 11 and Office. It will directly harm people who have no knowledge or recourse.

I mean I can understand if it got one or two facts wrong because the data sources might not be correct, but to get every single detail wrong requires extra skill. The really scary part is that Google AI search is not much better and both companies are in a race to replace their search engine with AI responses. Microsoft is going a step further and including it as a default option in Windows. I wonder how much of the user data being stored on a windows computer is being used to train these AI engines.

There needs to be an effort to create a search engine that filters out these AI generated responses and websites to go back to the old style search engines that actually returned useful & correct results.

– Suramya

August 29, 2023

Excel holding up the Global Financial System, now with Python support

Filed under: Computer Security,Computer Software,My Thoughts,Tech Related — Suramya @ 1:12 PM

It is both impressive and scary how much of the world’s financial systems is being run using Microsoft Excel. Folks have created formulars/macros/scripts/functions etc in Excel that allows them to generate data that is used to take major financial decisions with real world impact.

In one of my previous companies we actually had a full discussion on how to get an inventory of all the Excel code in use at the company and how to archive it so that we have backups and version control on them. Unfortunately, I left before much headway was made but I did learn enough about excel use to scare me. (Especially since I am not the biggest fan of Microsoft software 😉 )

Now you might ask why so many people are using excel when there are better tools available in the market and these companies have inhouse teams to create custom software for the analyst and I asked the exact same questions when I started. I think it is probably because the tool makes it easy for folks to come up with formulas and scripts that get their work done instead of having to wait for an external team to make the changes etc that they need.

Now, a few days ago Microsoft made a surprise announcement that going forward they are going to support running Python inside an Excel file. Yikes!! In order to use this functionality you will need to be part of the Microsoft 365 Insider program and then you can type Python code directly into cells using the new =PY() function, which then gets executed in the cloud. From what I have read, this will be enabled by default and needs to be disabled via a registry key.

Since its inception, Microsoft Excel has changed how people organize, analyze, and visualize their data, providing a basis for decision-making for the millions of people who use it each day. Today we’re announcing a significant evolution in the analytical capabilities available within Excel by releasing a Public Preview of Python in Excel. Python in Excel makes it possible to natively combine Python and Excel analytics within the same workbook – with no setup required. With Python in Excel, you can type Python directly into a cell, the Python calculations run in the Microsoft Cloud, and your results are returned to the worksheet, including plots and visualizations.

We already have issues with Excel Macros being used as vectors for malware & viruses, this just opens a whole new front in that war. Now, admins will have to worry about attackers using Python in Excel to infiltrate the organization or to send data outside the org. I can see how it is useful for people working with datasets and MS is adding this functionality to keep up with other tools such as Tableau etc which are more powerful but still I feel that this is a bad move.

Another problem that folks are going to face is that now your Excel sheets have Python programs inside them, how are we supposed to version the code, how is code review done? Basically this code should be going through the standard SDLC (Software Development Life Cycle) process but wouldn’t. We also need to ensure that all changes are reviewed and monitored to protect against insider attacks but the way the system is setup this is going to be extremely difficult (We have already seen that with Macros and Formulas etc).

Lets see how folks address this risk profile.

– Suramya

August 21, 2023

Workaround for paste not working in Whatsapp web on Firefox

Filed under: Computer Software,Knowledgebase — Suramya @ 10:40 PM

In the latest version of Firefox (116.0.3) something changed and due to this if you are using the Web version of Whatsapp you are no longer able to paste anything into the chat window. Ctrl-V, Right-Click -> Paste, nothing works. The issue exists on both Windows and Linux builds of Firefox. Did a search and found that others are facing the same issue and found a work around at Superuser.com: Firefox doesn’t allow to paste into WhatsApp web anymore?

In order to fix follow these steps:

  • Go to about:config
  • Search for ‘dom.event.clipboardevents.enabled’ (without the quotes)
  • Set the value to false by double clicking on the entry

The change takes effect immediately. This is not a permanent fix and once Firefox fixes the issue you should revert the change.

– Suramya

June 28, 2023

Please stop shoving ChatGPT Integration into products that don’t need it

I am getting really tired of folks shoving ChatGPT integration into everything whether it makes sense or not. The latest silliness is an electric bike with ChatGPT integration. I understand the desire to integrate GPS/Maps etc in a bike, although personally I would rather use an independent device which would get updates more frequently than the built in GPS where the maps might get updated a few times a year. Unless the maps are getting downloaded live using 3G/4G/whatever. I even understand the desire to integrate voice recognition in the setup so that the user can talk to it. But why on earth do I want/need to have ChatGPT shoved in there?

Based on ChatGPT’s well known tendency to hallucinate there is a good probability that it might decide that you should take a path that is not safe or even dump you into the ocean because it hallucinated that it was the way to go. This is the same thing we saw with Blockchain a few years ago, everything was suddenly on the Blockchain whether it needed to be or not. The sad part is that these folks are going to make a ton of money because of the hype behind ChatGPT and then bail leaving the consumers with a sub-par bike that hallucinates.

Source: Urtopia Unveils the World’s First Smart E-Bike with ChatGPT Integration at EUROBIKE 2023

– Suramya

February 27, 2023

It is now possible to put undetectable Backdoors in Machine Learning Models

Filed under: Computer Software,Emerging Tech,My Thoughts,Tech Related — Suramya @ 10:18 PM

Machine Learning (ML) has become the new go to buzzword in the Tech world in the last few years and everyone seems to be focusing on how they can include ML/AI in their products, regardless of whether it makes sense to include or not. One of the bigest dangers of this trend is that we are moving towards a future where an algorithm would have the power to make decisions that have real world impacts but due to the complexity it would be impossible to audit/check the system for errors/bugs, non-obvious biases or signs of manipulation etc. For example, we have had cases where the wrong person was identified as a fugitive and arrested because an AI/ML system claimed that they matched the suspect. Others have used ML to try to predict crimes with really low accuracy but people take it as gospel because the computer said so…

With ML models becoming more and more popular there is also more research on how these models are vulnerable to attacks. In December 2022 researchers (Shafi Goldwasser, Michael P. Kim, Vinod Vaikuntanathan and Or Zamir) from UC Berkely, MIT and Princeton published a paper titled “Planting Undetectable Backdoors in Machine Learning Models” in the IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS) where they discuss how it would be possible to train a model in a way that it allowed an attacker to manipulate the results without being detected by any computationally-bounded observer.

Abstract: Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. Delegation of learning has clear benefits, and at the same time raises serious concerns of trust. This work studies possible abuses of power by untrusted learners.We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key,” the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.

First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given query access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Moreover, even if the distinguisher can request backdoored inputs of its choice, they cannot backdoor a new input­a property we call non-replicability.

Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm (Rahimi, Recht; NeurIPS 2007). In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor. The backdooring algorithm executes the RFF algorithm faithfully on the given training data, tampering only with its random coins. We prove this strong guarantee under the hardness of the Continuous Learning With Errors problem (Bruna, Regev, Song, Tang; STOC 2021). We show a similar white-box undetectable backdoor for random ReLU networks based on the hardness of Sparse PCA (Berthet, Rigollet; COLT 2013).

Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, by constructing undetectable backdoor for an “adversarially-robust” learning algorithm, we can produce a classifier that is indistinguishable from a robust classifier, but where every input has an adversarial example! In this way, the existence of undetectable backdoors represent a significant theoretical roadblock to certifying adversarial robustness.

Basically they are talking about having a ML model that works correctly most of the time but allows the attacker to manipulate the results if they want. One example use case would be something like the following: A bank uses a ML model to decide if they should give out a loan to an applicant and because they don’t want to be accused of being discriminatory they give it to folks to test and validate and the model comes back clean. However, unknown to the testers the model has been backdoored using the techniques in the paper above so the bank can modify the output in certain cases to deny the loan application even though they would have qualified. Since the model was tested and ‘proven’ to be without bias they are in the clear as the backdoor is pretty much undetectable.

Another possible attack vector is that a nation state funds a company that trains ML models and has them insert a covert backdoor in the model, then they have the ability to manipulate the output from the model without any trace. Imagine if this model was used to predict if the nation state was going to attack or not. Even if they were going to attack they could use the backdoor to fool the target into thinking that all was well.

Having a black box making such decisions is what I would call a “Bad Idea”. At least with the old (non-ML) algorithms we could audit the code to see if there were issues with ML that is not really possible and thus this becomes a bigger threat. There are a million other such scenarios that could be played and if we put blind trust in an AI/ML system then we are setting ourselves up for a disaster that we would never see coming.

Source: Schneier on Security: Putting Undetectable Backdoors in Machine Learning Models

– Suramya

February 21, 2023

Fixing problems with nvidia-driver on Debian Unstable after latest upgrade

Filed under: Computer Software,Linux/Unix Related,Tech Related — Suramya @ 10:54 PM

Earlier today I ran my periodic update of my main desktop that is running Debian Unstable. The upgrade finished successfully and since a new kernel was released with this update I restarted the system to ensure that all files/services etc are running the same version. After the reboot the GUI refused to start and I thought the problem could be because of a NVIDIA kernel module issue so I tried to reboot to an older kernel but that didn’t work either. Then I tried running apt-get dist-upgrade again which gave me the following error:

root@StarKnight:~# apt-get dist-upgrade 
Reading package lists...
Building dependency tree...
Reading state information...
You might want to run 'apt --fix-broken install' to correct these.
The following packages have unmet dependencies:
 nvidia-driver : Depends: nvidia-kernel-dkms (= 525.85.12-1) but 515.86.01-1 is installed or
                          nvidia-kernel-525.85.12 or
                          nvidia-open-kernel-525.85.12 or
                          nvidia-open-kernel-525.85.12
E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).

So I ran the apt –fix-broken install command as recommended and that failed as well with another set of errors:

root@StarKnight:/var/log# apt --fix-broken install
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Correcting dependencies... Done
0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
dpkg: dependency problems prevent configuration of nvidia-driver:
 nvidia-driver depends on nvidia-kernel-dkms (= 525.85.12-1) | nvidia-kernel-525.85.12 | nvidia-open-kernel-525.85.12 | nvidia-open-kernel-525.85.12; however:
  Version of nvidia-kernel-dkms on system is 515.86.01-1.
  Package nvidia-kernel-525.85.12 is not installed.
  Package nvidia-open-kernel-525.85.12 is not installed.
  Package nvidia-open-kernel-525.85.12 is not installed.

dpkg: error processing package nvidia-driver (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 nvidia-driver
E: Sub-process /usr/bin/dpkg returned an error code (1)

Looking at the logs, I didn’t see any major errors but I did see the following message:

2023-02-21T19:48:27.668268+05:30 StarKnight kernel: [    3.379006] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  515.86.01  Wed Oct 26 09:12:38 UTC 2022
2023-02-21T19:48:27.668286+05:30 StarKnight kernel: [    4.821755] NVRM: API mismatch: the client has the version 525.85.12, but
2023-02-21T19:48:27.668287+05:30 StarKnight kernel: [    4.821755] NVRM: this kernel module has the version 515.86.01.  Please
2023-02-21T19:48:27.668287+05:30 StarKnight kernel: [    4.821755] NVRM: make sure that this kernel module and all NVIDIA driver
2023-02-21T19:48:27.668288+05:30 StarKnight kernel: [    4.821755] NVRM: components have the same version.

Searching on the web didn’t give me a solution but since I am running the Debian Unstable branch it is expected that once in a while things might break and sometimes they break quite spectacularly… So I started experimenting and tried removing and reinstalling the nvidia-driver but that was failing as well because the package was expecting nvidia-kernel-dkms version 525.85.12 but we had 515.86.01-1 installed.

root@StarKnight:~# apt-get install nvidia-driver
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  nvidia-driver
0 upgraded, 1 newly installed, 0 to remove and 14 not upgraded.
Need to get 0 B/494 kB of archives.
After this operation, 1,398 kB of additional disk space will be used.
Selecting previously unselected package nvidia-driver.
(Reading database ... 439287 files and directories currently installed.)
Preparing to unpack .../nvidia-driver_525.85.12-1_amd64.deb ...
Unpacking nvidia-driver (525.85.12-1) ...
dpkg: dependency problems prevent configuration of nvidia-driver:
 nvidia-driver depends on nvidia-kernel-dkms (= 525.85.12-1) | nvidia-kernel-525.85.12 | nvidia-open-kernel-525.85.12 | nvidia-open-kernel-525.85.12; however:
  Version of nvidia-kernel-dkms on system is 515.86.01-1.
  Package nvidia-kernel-525.85.12 is not installed.
  Package nvidia-open-kernel-525.85.12 is not installed.
  Package nvidia-open-kernel-525.85.12 is not installed.

Now I had a couple of options, first was to wait for a couple of days (if I am lucky) for someone to upload the correct versions of the packages to the channel. The second option was to remove the package and installed the Open Source version of the Nvidia driver. I didn’t want to do that because that package is a memory hog and doesn’t work that well either. The last option was to try to manually install the older version (525.85.12) of the nvidia-kernel-dkms package and this is what I decided to go with, a search on the Debian Packages site gave me the .deb file for nvidia-kernel-dkms and firmware-nvidia-gsp (a dependency for the dkms package). I downloaded both the packages and installed them using the following command:

root@StarKnight:/home/suramya/Media/Downloads# dpkg -i firmware-nvidia-gsp_525.85.12-1_amd64.deb 
root@StarKnight:/home/suramya/Media/Downloads# dpkg -i nvidia-kernel-dkms_525.85.12-1_amd64.deb 

Once the packages were successfully downgraded I rebooted the system and the GUI came up without issues post the reboot.

Moral of the story is that you need to be prepared to have to troubleshoot your setup if you are running Debian Unstable or Debian Testing on your system. If you don’t want to do that then you should stick to Debian Stable which is rock solid or one of the other distributions such as Ubuntu or Linux Mint etc.

– Suramya

January 21, 2023

Fixing AssertionError: Font Arial,Bold can not represent ‘E’ when using Borb to modify PDF Files

Filed under: Computer Software,Knowledgebase,Tech Related — Suramya @ 12:47 AM

I have a bunch of PDF files that I need to modify to remove text from them. Initially I was using LibreDraw but that was a manual task so I thought that I should script it/Automate it. Little did I know that programmatically editing PDF’s is not that simple. I tried a bunch of libraries such as PyPDF4, pikepdf etc but the only one which worked was borb which is a library by Joris Schellekens. They have a great collection of examples and using that I got my first script that searched and replaced text in the PDF working.

However, when I tried to run the script against my pdf file the script fails with the following error:

Traceback (most recent call last):
  File "/home/suramya/Temp/BorbReplace.py", line 26, in 
    main()
  File "/home/suramya/Temp/BorbReplace.py", line 18, in main
    doc = SimpleFindReplace.sub("Manual", "", doc)
  File "/usr/local/lib/python3.10/dist-packages/borb/toolkit/text/simple_find_replace.py", line 80, in sub
    page.apply_redact_annotations()
  File "/usr/local/lib/python3.10/dist-packages/borb/pdf/page/page.py", line 271, in apply_redact_annotations
    .read(io.BytesIO(self["Contents"]["DecodedBytes"]), [])
  File "/usr/local/lib/python3.10/dist-packages/borb/pdf/canvas/canvas_stream_processor.py", line 290, in read
    raise e
  File "/usr/local/lib/python3.10/dist-packages/borb/pdf/canvas/canvas_stream_processor.py", line 284, in read
    operator.invoke(self, operands, event_listeners)
  File "/usr/local/lib/python3.10/dist-packages/borb/pdf/canvas/redacted_canvas_stream_processor.py", line 271, in invoke
    self._write_chunk_of_text(
  File "/usr/local/lib/python3.10/dist-packages/borb/pdf/canvas/redacted_canvas_stream_processor.py", line 203, in _write_chunk_of_text
    )._write_text_bytes()
  File "/usr/local/lib/python3.10/dist-packages/borb/pdf/canvas/layout/text/chunk_of_text.py", line 145, in _write_text_bytes
    return self._write_text_bytes_in_hex()
  File "/usr/local/lib/python3.10/dist-packages/borb/pdf/canvas/layout/text/chunk_of_text.py", line 160, in _write_text_bytes_in_hex
    assert cid is not None, "Font %s can not represent '%s'" % (
AssertionError: Font Arial,Bold can not represent 'E'

Process finished with exit code 1

I tried a couple of different files and the font name changes but the error remains

The script I was using is:

from borb.pdf import Document
from borb.pdf import PDF
from borb.toolkit import SimpleFindReplace

import typing

def main():

    # attempt to read a PDF
    doc: typing.Optional[Document] = None
    with open("/home/suramya/Downloads/t/MAA1.pdf", "rb") as pdf_file_handle:
        doc = PDF.loads(pdf_file_handle)

    # check whether we actually read a PDF
    assert doc is not None

    # find/replace
    doc = SimpleFindReplace.sub("PRIVATE", "XXXX", doc)

    # store
    with open("/home/suramya/Downloads/t/MAABLR_out.pdf", "wb") as pdf_file_handle:
        PDF.dumps(pdf_file_handle, doc)


if __name__ == "__main__":
    main()

I searched on the web and didn’t find any solutions so I reached out to the project owner and they responded with the following message “Not every font can represent every possible character in every language. you are trying to insert a piece of text that contains a character that Arial can not represent. Maybe some weird kind of “E” (since uppercase E should not be a problem).”. The problem was that I wasn’t trying to replace any strange characters, just a normal uppercase E.

To help trouble shoot, they asked me for a copy of the file. So I was masking the data in the PDF file to share it and the script suddenly started working. Turns out that there was an extra space after the word PRIVATE in the file and when I removed it things started working (even on the unmasked file). So it looks like the issue is caused when there is an encoding issue with the PDF file. Opening it in Libre Draw and exporting as a new PDF file seems to resolve the issue.

Now we are a step closer to the solution, I just need to figure out how to convert the file from the command line and I will be home free. Something to work on when I have had some sleep.

– Suramya

January 6, 2023

Good developers need to be able to communicate and collaborate and those are not euphemisms for politics and org building

Filed under: Computer Software,My Thoughts,Tech Related — Suramya @ 11:25 PM

Saw this gem in my Twitter feed a little while ago and had to save it so that I could comment on it.

Twitter screenshot stating: Because to some people, in order to be a senior software engineer it's about politics and org building (perhaps you'll hear euphemisms communication and collaboration)
Because to some people, in order to be a senior software engineer it’s about politics and org building (perhaps you’ll hear euphemisms communication and collaboration)

There is a constant theme in Programming that the good developers are anti-social, can’t be bothered to collaborate and should be left alone so that they can create a perfect product. The so called 10x developer. This is emphasized by movie stories about the genius developer creating something awesome sitting in their basement. Unfortunately that is not how real life works as this 10x developer is a myth. In real life you need to be able to communicate, collaborate and work in a team in order to be successful as a programmer. No single person can create an enterprise level software alone and even if you could it needs to be something that people want/need, so guess what you will have to talk to your users to understand what problems they are facing and then work on software that will fix them or make their lives easier.

In one of my previous company, my role was to look at new software/systems and bring them into the company. So we went to expos, talked to startups and explored the market and found a really cool software that we thought would be extremely useful for the business so we went back and pitched it to the business. To our shock no one was interested in adopting the software because it didn’t address any of the pain points that the business was facing. We thought it would be useful for them because we were looking at it from the outside and hadn’t bothered talking to them about what their pain points were. Then we sat down with the business and their development teams to understand the setup and find out what are the most urgent/painful problems that we should fix. After multiple discussions we went out and found a software that addressed a significant pain point for the business and as soon as we demo’d it, we were asked to expedite getting it validated/approved for installed in their org.

Similarly, one of the startups I was working with during the same time were creating tech to help blind people and I happened to mention that to the founder of a NGO (Non-Government Organization) that works with blind people and his response was that what they are creating is cool but I wish they would actually talk to some blind people before they start working on tech to help them, as the blind people don’t want systems that will give them sight but rather assist them in doing things without trying to recreate sight.

Coming back to the original point about Senior Software Engineer, it is not their job to work on every part of the project themselves. Their job is to look at the high level goal, design the architecture and work with other developers in their team to create the software. Another major task of the senior Software Engineer is to mentor their juniors, teach them the tricks of the trade and help them grow in their skills and role. I personally believe that I should always be training the people under me so that they can one day replace me so that I can move on to more interesting projects. If you make yourselves indispensable in your current role and no one can replace you then you will always be doing the same thing and can never move on. Yes, there is a risk that you might be replaced with a junior and get fired but that can even happen to the 10x developer as well. Personally, I would rather have 10 regular developers than a single 10x developer as they are a pain to work with. They will insist on having full control of the entire dev process will refuse to share information that other developers/database/network folks need and basically become a bottle-neck for the entire project.

The way I look at being a senior engineer/architect is that I get to work on the really interesting problems, write code for PoC’s (Proof of Concept) that fix the problem. Then I can handoff the code to others who can productionalize it with me providing guidance and support. Its not to say that I wouldn’t get my hands dirty productionalizing the system but I rather solve interesting problems.

Another myth is that the only person who knows the system will never get fired. I have taken over multiple systems over the years (at least 4 that I can recall for sure) where they were originally managed by a single person who refused to collaborate/communicate with the rest of the team. In some cases they were fired and I was asked to take over, in others they were moved to other non-critical projects so they stopped being a road block. It each case took us a lot of time to reverse engineer/understand the system but it was worth the effort to do that so that we could make future changes without fighting with someone for every change or having to call the person for information everytime the system gave problems.

Long story short: communications doesn’t equate politics and collaboration doesn’t equate org building. If you think that they do then you will be miserable in any mid to large size company. You might get away with it in a startup initially but not for long as the team grows you will be expected to work together with other developers/admins (collaborate) to create systems that others want and for that you will need to communicate with others to ensure what you are making is actually useful.

Well this is all for now. Will write more later.

– Suramya

December 1, 2022

Analysis of the claim that China/Huawei is remotely deleting videos of recent Chinese protests from Huawei phones

Filed under: Computer Hardware,Computer Software,My Thoughts,Tech Related — Suramya @ 2:23 AM

There is an interesting piece of news that is slowly spreading over the internet in the past few hours where Melissa Chen is claiming over at Twitter that Huawei phones are automatically deleting videos of the protests that took place in China, without notifying their owners. Interestingly I was not able to find any other source reporting this issue. All references/reports of this issue are linking back to this tweet and based on this single tweet that is not supported by external validation. Plus the tweet does not even provide enough information to validate that this is happening other than a single video shared as part of the original tweet.


Melissa Chen claiming on Twitter that videos of protests are being automatically deleted by Huawei without notification

However, it is an interesting exercise to think how this could have been accomplished, what the technical requirements for this to work would look like and if this is something that would happen. So lets go ahead and dig in. In order to delete a video remotely, we would need the following:

  • The capability to identify the videos that need to be deleted without impacting other videos/photos on the device
  • The capability to issue commands to the device remotely that all sensitive videos from xyz location taken at abc time need to be nuked and Monitor the success/failure of the commands
  • Identify the devices that need to have the data on the looked at. Keeping in mind that the device could have been in airplane mode during the filming

Now, lets look at how each of these could be accomplished one at a time.

The capability to identify the videos that need to be deleted without impacting other videos/photos on the device

There are a few ways that we can identify the videos/photos to be deleted. If it was a video from a single source then we could have used a HASH value of the video to identify it and then delete. Unfortunately in this case the video in question is recorded by the device so each video file will have a separate hash value so this is not how we could do this.

The second option is to use the Metadata in the file, to identify the date & time along with the physical location of the video to be deleted. If videos were recorded within a geo-fence area in a specific timeframe then we potentially have the information required to identify the videos in question. The main problem would be that the user could have disabled geo-tagging of photos/videos taken by the phone or the date/time stamp might be incorrect.

One way to bypass this attempt to save the video would be to have the app/phone create a separate geo-location record of every photo/video taken by the device even when GPS is disabled or Geo tagging is disabled. This would require a lot of changes in the OS/App file and since a lot of people have been looking at the code in Huawei phones for issues ever since there was an accusation that they are being used by China to spy on western world, it is hard to imagine this would have escaped from scrutiny.

If the app was saving the data in the video/photo itself rather than a separate location then it should be easy enough to validate by examining the image/video data of photos/videos taken by any Huawei phone. But I don’t see any claims/reports that prove that this is happening.

The capability to issue commands to the device remotely that all sensitive videos from xyz location taken at abc time need to be nuked and Monitor the success/failure of the commands

Coming to the second requirement, Huawei or the government would need the capability to remotely activate the functionality to delete the videos. In order to do this the phone would need to be connecting to a Command & Control (C&C) channel frequently to check for commands. Or the phone would have something listening to remote commands from a central server.

Both of these are hard to disguise and hide. Yes, there are ways to hide data in DNS queries and other such methods to cover the tracks but thanks to Botnets, malware and Ransomware campaigns the ability to identify hidden C&C channels is highly developed and it is hard to hide from everyone looking at this. If the phone has something listening to commands then a scan of the device for open ports/apps listening to connections would be an easy thing to check and even if the app listening is disguised it should be possible to identify that something is listening.

You might say that the commands to activate might be hidden in the normal traffic going to & from the device to the Huawei servers and while that is possible we can check for it by installing a root certificate and passing all the traffic to/from the device via a proxy to be analyzed. Not impossible to do but hard to achieve without leaving signs, and considering the scrutiny these phones are going through hard to accept that this is something that is happening without anyone finding out about it.

Identify the devices that need to have the data on the looked at. (Keeping in mind that the device could have been in airplane mode during the filming)

Next, we have the question on how would Huawei identify the devices that need to run the check for videos. One option would be to issue the command to all their phones anywhere in the world. This would potentially be noisy and there is a possibility that a sharp eyed user catches the command in action. So far more likely option would be for them to issue it against a subset of their phones. This subset could be all phones in China, all phones that visited the location in question around the time the protest happened or all phones that are there in or around the location at present.

In order for the system to be able to identify users in an area, they have a few options. One would be to use GPS location tracking which would require the device to constantly track its location and share with a central location. Most phones already do this. One potential problem would be when users disable GPS on the device but other than that this would be an easy request to fulfill. Another option is to use cell tower triangulation to locate/identify the phones in the area at a given time. This is something that is easily done at the provider side and from what I read quite common in China. Naomi Wu AKA RealSexyCyborg had a really interesting thread on this a little while ago that you should check out.

This doesn’t even account for the fact that China has CCTV coverage across most of its jurisdiction and claim to have the ability to run Facial recognition across this massive amount of video collected. So, it is quite easy for the government to identify the phones that need to be checked for sensitive photos/videos with existing & known technology and ability.

Conclusion/Final thoughts

Now also remember that if Huawei had the ability to issue commands to its phones remotely then they also have the ability to extract data from the phones, or plant information on the phone. Which would be a espionage gold mine as people use their phones for everything and have then with them always. Loosing the ability to do this just to delete videos is not something that I feel China/Huawei would do as harm caused by the loss of intelligence data would far outweigh the benefits of deleting the videos. Do you really think that every security agency, Hacker Collective, bored programmers, Antivirus/cybersec firms would not immediately start digging into the firmware/apps on any Huawei phone once it was known and confirmed that they are actively deleting stuff remotely.

So, while it is possible that Huawei/China has the ability to scan and delete files remotely I doubt that this is the case right now. Considering that there is almost no reports of this happening anywhere and no independent verification of the same plus it doesn’t make sense for China to nuke this capability for such a minor return.

Keeping that in mind this post seems more like a joke or fake news to me. That being said, I might be completely mistaken about all this so if you have additional data or counter points to my reasoning above I would love for you to reach out and discuss this is more detail.

– Suramya

November 18, 2022

Twitter Extract: Downloading data not exposed in the Official Data export

Filed under: Computer Software,Software Releases — Suramya @ 2:46 PM

It looks like Twitter is imploding and even though I don’t think that it will go down permanently it seemed like a good time to export data so that I have a local backup available. Usually I just ask Twitter to export my data but this time I needed additional data as I was preparing a backup that would work even when Twitter was down completely. The Twitter Export didn’t give me all the data I wanted, specifically I needed an export of the following which wasn’t there in the official export:

  • Owned Lists (Including Followers and Members of the list)
  • Subscribed Lists with Followers and Members of the List
  • List of all Followers (ScreenName, Fullname and ID)
  • List of all Following (ScreenName, Funnname and ID)

So I created a script that exports the above. It is available for download at: Github: TwitterExtract. Instructions for installation and running are there in the ReadMe file.

This was created as a quick and dirty solution so it is not productionalized (i.e. it doesn’t have a lot of error checking, hardening etc) but it does what it is supposed to do. Check it out and let me know what you think. Bug Fixes and additional features are welcome…

– Suramya

« Newer PostsOlder Posts »

Powered by WordPress