Suramya's Blog : Welcome to my crazy life…

July 27, 2020

Cloaking your Digital Image using Fawkes to thwart unauthorized Deep Learning Models

Filed under: Computer Related,Computer Software,My Thoughts,Tech Related — Suramya @ 3:42 PM

Unless you have been living under a rock, you have seen or heard about the facial recognition technologies actively in use around the world. There is the movie/TV version, where a still image from a video feed is instantly compared against every image in a database to match a perp, and then there is the real-world version, where systems harvest your social media feeds and any images of you posted anywhere as a dataset to train a model that can identify you from a video feed (not as quickly as the TV version, but still fast).

So what is the way to prevent this? Unfortunately, there isn't one (or at least there wasn't a realistic one until recently). Earlier, you had to ensure that no image of yours was ever posted online and that you were never caught on a security feed or traffic cam anywhere, which, as you can imagine, is pretty much impossible in today's connected world. Even if I never post a picture of myself online, friends who attended a party with me might upload a pic with me in the background and tag me, or I might get peer-pressured into uploading photos to Facebook or Twitter.

There is not much we can do about state-sponsored learning models, but there are plenty of other folks running unauthorized setups that consume publicly posted photos without permission to train their AI models. These are the systems targeted by folks from the SAND Lab at the University of Chicago, who have developed Fawkes, an algorithm and software tool (running locally on your computer) that gives individuals the ability to limit how their own images can be used to track them.

At a high level, Fawkes takes your personal images and makes tiny, pixel-level changes to them that are invisible to the human eye, in a process we call image cloaking. You can then use these “cloaked” photos as you normally would, sharing them on social media, sending them to friends, printing them or displaying them on digital devices, the same way you would any other photo. The difference, however, is that if and when someone tries to use these photos to build a facial recognition model, “cloaked” images will teach the model a highly distorted version of what makes you look like you. The cloak effect is not easily detectable, and will not cause errors in model training. However, when someone tries to identify you using an unaltered image of you (e.g. a photo taken in public), they will fail.
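To make the "tiny, pixel-level changes" idea concrete, here is a deliberately naive Python sketch of my own (this is not the Fawkes algorithm, which carefully optimizes its perturbation against facial-feature extractors); it only illustrates how small a bounded per-pixel tweak is. The function name and the 3-level budget are my inventions for illustration:

```python
import random

def cloak(pixels, budget=3, seed=42):
    """Toy illustration only: nudge each grayscale pixel by at most
    `budget` intensity levels, clipped to the valid 0-255 range.
    Fawkes optimizes its perturbation so models mislearn your features;
    this just shows the magnitude of an imperceptible change."""
    rng = random.Random(seed)
    return [
        [max(0, min(255, p + rng.randint(-budget, budget))) for p in row]
        for row in pixels
    ]

# A tiny 2x3 "image" of grayscale values
image = [[0, 128, 255], [10, 200, 90]]
cloaked = cloak(image)

# Every pixel stays within the perturbation budget after clipping
max_diff = max(abs(a - b) for r1, r2 in zip(image, cloaked)
               for a, b in zip(r1, r2))
```

A real cloak has to survive JPEG compression and resizing while still steering the model's learned features, which is where the actual research effort lies.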

The research and the tool will be presented at the upcoming USENIX Security Symposium, to be held on August 12 to 14. The software is available for download at the project's GitHub repository, and they welcome contributions.

It will be amazing to see this tool mature; I can imagine it becoming a default part of operating systems, so that all uploaded images are processed by the tool by default, reducing the risk of automatic facial recognition. Although I can't imagine any of the governments or Facebook being too happy about this tool being publicly available. 🙂

Well this is all for now. Will write more later.

Thanks to Schneier on Security for the initial link.

– Suramya

March 4, 2020

Seti@home project to stop distributing new work to clients after 21 years

Filed under: Computer Software,News/Articles,Tech Related — Suramya @ 1:44 PM

Seti@home has a fond place in my heart. It has been run by the Berkeley SETI Research Center since 1999, and I think I installed it on my machine sometime around December 1999 or early 2000, after hearing about it on one of the news websites (possibly Slashdot). Once I had it running on my system, I pushed to get it installed on all the computers in the university computer lab, as they were running 24/7 and I saw that as a wasted opportunity for furthering the search for ET. I ran it constantly on my computers until about 2009, after which I switched to running Folding@home, which is more focused on science / DNA sequencing / medical research. Seti@home was one of the first distributed computing systems that I know of, and the amount of data processed by computers under its umbrella is staggering.

On March 31, the project will stop sending out new work units to users and will instead start focusing on analyzing all the blips identified by volunteers' machines, which could be potential evidence of aliens, with an eye on publishing a research paper. Once this is completed they might start pushing out work packages again, but it will be a while before that happens.

“It’s a lot of work for us to manage the distributed processing of data. We need to focus on completing the back-end analysis of the results we already have, and writing this up in a scientific journal paper,” their news announcement stated.

Looking forward to reading the research paper and conclusions generated by the Seti@home program.

Source: SETI@home Search for Alien Life Project Shuts Down After 21 Years

– Suramya

March 2, 2020

Another magical AI to detect “Inappropriate photos” and block kids from taking them

Filed under: Computer Software,My Thoughts,News/Articles,Tech Related — Suramya @ 11:50 AM

In today's iteration of people who don't want to make the effort of raising their kids, explaining the difference between right and wrong, and why something might be a bad idea, we have a new “magical” AI that will automatically detect when kids are making a bad choice and stop them. I mean, why should parents make the effort to talk to their kids and help them understand the repercussions of a given choice when you have an AI to make the effort for them? The new phone being pitched to parents has an AI-powered “Smartphone Protection” feature that prevents users from shooting or saving “inappropriate” photos (read: naked pictures).

The official Tone Mobile press release hails the TONE e20 as the world’s first phone with an AI that “regulates inappropriate images” through an AI built into the so-called TONE Camera… If the AI recognizes that the subject of a photo is “inappropriate,” the camera will lock up; and if you somehow manage to snap a photo before the AI kicks in, the phone won’t let you save or share it.

Additionally, a feature called “TONE Family” can be set to send an alert to parents whenever an inappropriate image is detected. According to SoraNews24, this alert will contain location data and a pixelated thumbnail of the photo in question.

I give it about 24 hours from when the phone is released until folks figure out a way around it.

The other issue I have with this system is how it's going to classify the pics. The article doesn't go into the technical details of how the AI works, or whether the classification is done locally or in the cloud. If it's in the cloud, then every pic taken by that phone is being uploaded to a remote server owned by a third party. This is a massive risk, and any breach of that server is going to have a lasting and significant impact. Trust me when I say that this server would be a target for all black hat hackers as soon as it goes online.

I am not going to go into whether taking nude pics is a good idea or not. It's up to the people involved to make that decision; I am not responsible for what you do with your phone. If you have to take naughty pics, just ensure you follow basic rules and don't share them with anyone you don't trust 100%.

In summary, dear parents: instead of offloading your responsibilities to an AI, try having a frank and open conversation with your kids about why certain things might be a bad idea. It will give you better results than this snake oil.

Source: Slashdot.org

– Suramya

September 3, 2019

AI Emotion-Detection Arms Race starts with AI created to mask emotions

Filed under: Computer Software,My Thoughts,Tech Related — Suramya @ 2:31 PM

Over the past few months/years we have been reading a lot about AI being used to identify emotions like fear and confusion, and even traits like lying or the trustworthiness of a person, by analyzing video and audio recordings. This is driving innovations in recruiting, criminal investigations, etc. In fact, the global emotion detection and recognition market is estimated to grow at a compound annual growth rate of 32.7% between 2018 and 2023, reaching USD 24.74 billion by 2023. So a lot of companies are focusing their efforts on this space, as AI applications that are emotionally aware give a more realistic experience for users. However, there are multiple privacy implications of having a system detect a person's emotional state when they interact with an online system.

So, to counter this trend of systems becoming more and more aware, a group of researchers has come up with an AI-based countermeasure to mask emotion in spoken words, kicking off an arms race between the two factions. The idea is to automatically convert emotional speech into “normal” speech using AI.

Their method for masking emotion involves collecting speech, analyzing it, and extracting emotional features from the raw signal. Next, an AI program trains on this signal and replaces the emotional indicators in speech, flattening them. Finally, a voice synthesizer re-generates the normalized speech using the AI's outputs, which gets sent to the cloud. The researchers say that this method reduced emotional identification by 96 percent in an experiment, although speech recognition accuracy decreased, with a word error rate of 35 percent.
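The pipeline above (extract emotional cues, flatten them, re-synthesize) is far beyond a blog snippet, but the "flattening" step has a simple intuition. Here is a loose, stdlib-only Python analogy of mine, not the paper's method: rescale each frame of a signal to a constant loudness (RMS), removing the amplitude swings that are one cue for emotion. Frame size and target level are arbitrary choices for the demo:

```python
import math

def flatten_envelope(signal, frame=4, target_rms=0.5):
    """Loose analogy to prosody flattening: rescale each frame of a
    signal so its RMS loudness is constant, removing amplitude
    variation while keeping the waveform shape within each frame."""
    out = []
    for i in range(0, len(signal), frame):
        chunk = signal[i:i + frame]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        scale = target_rms / rms if rms > 1e-9 else 0.0
        out.extend(x * scale for x in chunk)
    return out

# A signal whose loudness swings between a quiet frame and a loud frame
sig = [0.1, -0.1, 0.1, -0.1, 0.9, -0.9, 0.9, -0.9]
flat = flatten_envelope(sig)  # both frames now have RMS 0.5
```

The real system learns which features to neutralize (pitch, tempo, spectral cues) rather than crudely normalizing loudness, which is why it can keep the words intelligible while stripping the emotion.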

In a way it's quite cool because it removes a potential privacy issue, but if you extrapolate from existing research, we have the potential for bigger headaches in the future. Currently we have the capability to remove emotion from an audio recording; how difficult would it be to add emotion to a recording? Not too difficult, if you go through the ongoing research. So now we have a system that can take an audio/video recording and change the emotion from sadness to mocking, or from happy to sad. Combined with the deepfake apps that are already on the market, this will cause huge headaches for the public, as it would be really hard for us to determine whether a given audio/video is authentic or altered.

Article: Researchers Created AI That Hides Your Emotions From Other AI
Paper: Emotionless: Privacy-Preserving Speech Analysis for Voice Assistants

Well this is all for now. Will write more later.

– Suramya

July 17, 2019

Using Machine Learning To Automatically Translate Long-Lost Languages

Filed under: Computer Software,Interesting Sites,My Thoughts,Tech Related — Suramya @ 1:25 PM

Machine Learning has become such a buzzword that any new product or research being released nowadays has to mention ML somewhere, even when it has nothing to do with it. But this particular use case is actually very interesting, and I am looking forward to more advances on this front. Researchers Jiaming Luo and Regina Barzilay from MIT and Yuan Cao from Google's AI lab in Mountain View, California, have created a machine-learning system capable of deciphering lost languages.

Normally, machine translation programs work by mapping out how the words in a given language are related to each other. This is done by processing large amounts of text in the language and creating vector maps of how often each word appears next to every other word, for both the source and target languages. Unfortunately, this requires a large dataset (text) in the language, which is not possible in the case of lost languages, and that's where the brilliance of this new technique comes in. Building on the fact that when languages evolve over time they can only change in certain ways (e.g. related words have the same order of characters, etc.), they came up with a ruleset for deciphering a language when the parent or child of the language being translated is known.
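The "related words keep much of their character order" constraint can be sketched with a toy cognate matcher: for each word in the lost language, pick the most similar word in the known relative's vocabulary. This is only a crude stand-in for the actual system, which learns these alignments jointly, and both vocabularies below are made up for illustration:

```python
from difflib import SequenceMatcher

def best_cognate(word, known_vocab):
    """Toy sketch: pick the known-language word whose character
    sequence is most similar, echoing the constraint that cognates
    preserve much of their character order across related languages."""
    return max(known_vocab,
               key=lambda w: SequenceMatcher(None, word, w).ratio())

lost = ["patir", "matir"]             # hypothetical "lost language" words
known = ["pater", "mater", "frater"]  # hypothetical progenitor vocabulary
mapping = {w: best_cognate(w, known) for w in lost}
# mapping pairs each lost word with its closest known-language cognate
```

The real system goes much further, learning character-level embeddings and enforcing that the alignment is consistent across the whole vocabulary, but the underlying signal is the same: cognates look alike.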

To test their theory/process, they tried it out on two lost languages, Linear B and Ugaritic. Linguists know that Linear B encodes an early version of ancient Greek and that Ugaritic, which was discovered in 1929, is an early form of Hebrew. After processing, the system was able to correctly translate 67.3% of Linear B words into their Greek equivalents, which is a remarkable achievement and marks a first in the field.

There are still some restrictions with the new algorithm, in that it doesn't work if the progenitor language is not known. But work on the system is ongoing, and who knows, some new breakthrough might be just around the corner. Plus, there is always the brute-force approach, where the system tries translating a given language using every possible language as the progenitor. It would require a lot of compute and time, but it is something to look at as an option.

Well, this is all for now. Will write more later.

– Suramya

Source: Machine learning has been used to automatically translate long-lost languages

May 27, 2019

Microsoft and Brilliant launch Online Quantum Computing Class that actually looks useful

Quantum computing (QC) is the next big thing and everyone is eager to jump on the bandwagon, so my email and news feeds are usually flooded with articles on how QC will solve all my problems. I don't deny that there are some very interesting use cases out there that would benefit from quantum computers, but after a while it gets tiring. That being said, I just found out that Microsoft and Brilliant have launched a new interactive course on quantum computing that lets you build quantum algorithms from the ground up, with a quantum computer simulated in your browser, and I think it's pretty cool and a great initiative. The tutorial teaches you Q#, which is Microsoft's answer to the question of which language to use for quantum computing code. Check it out if you are interested in learning how to code in Q#.

The course starts with basic concepts and gradually introduces you to Microsoft’s Q# language, teaching you how to write ‘simple’ quantum algorithms before moving on to truly complicated scenarios. You can handle everything on the web (including quantum circuit puzzles) and the course’s web page promises that by the end of the course, “you’ll know your way around the world of quantum information, have experimented with the ins and outs of quantum circuits, and have written your first 100 lines of quantum code — while remaining blissfully ignorant about detailed quantum physics.”
Brilliant has more than 8 million students and professionals worldwide learning subjects from algebra to special relativity through guided problem-solving. In partnership with Microsoft’s quantum team, Brilliant has launched an interactive course called “Quantum Computing,” for learning quantum computing and programming in Q#, Microsoft’s new quantum-tuned programming language. The course features Q# programming exercises with Python as the host language (one of our new features!). Brilliant and Microsoft are excited to empower the next generation of quantum computer scientists and engineers and start growing a quantum workforce today.

Starting from scratch

Because quantum computing bridges the fields of information theory, physics, mathematics, and computer science, it can be difficult to know where to begin. Brilliant’s course, integrated with some of Microsoft’s leading quantum development tools, provides self-learners with the tools they need to master quantum computing.
The new quantum computing course starts from scratch and brings students along in a way that suits their schedule and skills. Students can build and simulate simple quantum algorithms on the go or implement advanced quantum algorithms in Q#.

Once you have gone through the tutorial you should also check out IBM Q that allows you to code on a Quantum computer for free.
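Under the hood, a browser-based quantum simulator like the one in the course is just linear algebra on amplitude vectors. As a taste of that (in Python rather than Q#, and entirely my own minimal sketch), here is a single qubit put into superposition with a Hadamard gate:

```python
import math

# A qubit is a pair of complex amplitudes; a single-qubit gate is a
# 2x2 matrix applied to that pair. Measurement probabilities are the
# squared magnitudes of the amplitudes.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

def apply(gate, state):
    """Multiply a 2x2 gate matrix into a 2-amplitude state vector."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]

ket0 = [1.0, 0.0]              # the |0> basis state
superposed = apply(H, ket0)    # (|0> + |1>) / sqrt(2)
probs = [abs(a) ** 2 for a in superposed]  # 50/50 measurement odds
```

Real simulators track 2^n amplitudes for n qubits, which is exactly why classical simulation runs out of steam and actual quantum hardware gets interesting.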

– Suramya

August 24, 2018

Fixing the appstreamcli error when running apt-get update

Filed under: Computer Software,Knowledgebase,Linux/Unix Related,Tech Related — Suramya @ 12:05 AM

Over the past few days, every time I tried to update my Debian system using apt-get it would fail with the following error message:

(appstreamcli:5574): GLib-CRITICAL **: 20:49:46.436: g_variant_builder_end: assertion '!GVSB(builder)->uniform_item_types || 
GVSB(builder)->prev_item_type != NULL || g_variant_type_is_definite (GVSB(builder)->type)' failed

(appstreamcli:5574): GLib-CRITICAL **: 20:49:46.436: g_variant_new_variant: assertion 'value != NULL' failed

(appstreamcli:5574): GLib-ERROR **: 20:49:46.436: g_variant_new_parsed: 11-13:invalid GVariant format string
Trace/breakpoint trap
Reading package lists... Done
E: Problem executing scripts APT::Update::Post-Invoke-Success 'if /usr/bin/test -w /var/cache/app-info -a -e /usr/bin/appstreamcli; then appstreamcli refresh-cache > 
/dev/null; fi'
E: Sub-process returned an error code

I spent a couple of hours trying to figure out what was causing it and was able to identify that it was caused by a bug in appstream, as running the command manually also failed with the same error. Removing the package, as recommended by a few sites, would have removed the entire KDE desktop from my machine, which I didn't want, so I was at a loss as to how to fix the problem. I put the update on hold until I had a bit more time to research the issue and identify a solution.

Today I got some free time and decided to try again, and after a little bit of searching I stumbled upon the following bug report (#906544), where David explained that the error was caused by a bug in the upstream version of appstream, and a little while later Matthias commented that the issue was fixed in the latest version of the software and would flow down to the Debian repositories shortly. Normally I would have just done an apt-get update and then an install to get the latest package, but since the whole issue was that I couldn't get the system to finish the update command, I had to install the package manually.

To do that, I went to the Debian site and opened the software package list for Debian Unstable (as that is what I am using) and searched for appstream. This gave me a link to the updated package (0.12.2-2) that fixed the bug (I had 0.12.2-1 installed). Once I downloaded the package (make sure you download the correct package for your system architecture), I manually installed it using the following command as root:

dpkg -i appstream_0.12.2-2_amd64.deb

This installed the package and I was then able to do an apt-get update successfully. I still get the GLib-CRITICAL warnings, but those can apparently be ignored without issues.

Hope this helps people who hit the same issue (or reminds me of the solution if/when I hit the issue again).

– Suramya

August 23, 2018

Identifying Programmers by their Coding Style

Filed under: Computer Security,Computer Software,Tech Related — Suramya @ 8:42 PM

There is an interesting development in the field of identifying people by what they write. As some of you may already know, researchers have for a while now been able to identify who wrote a particular text by analyzing things like word choice, sentence structure, syntax and punctuation, using a technique called stylometry, but it was limited to natural languages and not artificial ones like programming languages.

Now there is new research by Rachel Greenstadt and Aylin Caliskan, professors of computer science at Drexel University and George Washington University respectively, that proves that code, like other forms of writing, is not anonymous. They used machine learning algorithms to de-anonymize coders, and the really cool part is that they can do this even with reverse-compiled code from binaries, with a reasonable level of confidence. So you don't need access to the original source code to be able to identify who wrote it (assuming we have code samples from them in the training DB).

Here’s a simple explanation of how the researchers used machine learning to uncover who authored a piece of code. First, the algorithm they designed identifies all the features found in a selection of code samples. That’s a lot of different characteristics. Think of every aspect that exists in natural language: There’s the words you choose, which way you put them together, sentence length, and so on. Greenstadt and Caliskan then narrowed the features to only include the ones that actually distinguish developers from each other, trimming the list from hundreds of thousands to around 50 or so.

The researchers don’t rely on low-level features, like how code was formatted. Instead, they create “abstract syntax trees,” which reflect code’s underlying structure, rather than its arbitrary components. Their technique is akin to prioritizing someone’s sentence structure, instead of whether they indent each line in a paragraph.
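As a toy stand-in for those abstract-syntax-tree features, here is a small Python sketch of mine: it counts AST node types in a code sample, capturing structural choices (a comprehension versus an explicit loop) rather than formatting. The paper's feature set is far richer and works on C/C++, so treat this purely as an illustration of the idea:

```python
import ast
from collections import Counter

def ast_profile(source):
    """Count AST node types in a code sample: a crude stand-in for the
    syntax-tree features used for authorship attribution. Structure
    survives reformatting, which is why such features are robust."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))

# Two samples doing the same thing in different "styles"
sample_a = "def squares(n):\n    return [i * i for i in range(n)]\n"
sample_b = (
    "def squares(n):\n"
    "    out = []\n"
    "    for i in range(n):\n"
    "        out.append(i * i)\n"
    "    return out\n"
)

profile_a, profile_b = ast_profile(sample_a), ast_profile(sample_b)
# profile_a contains a ListComp node; profile_b contains a For node
```

Feed enough such profiles from known authors into a classifier and you have the skeleton of the de-anonymization attack; the research then prunes hundreds of thousands of candidate features down to the ~50 that actually distinguish developers.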

This is both really cool and a bit scary, because suddenly we have the ability to identify who wrote a particular piece of code. This removes, or at least reduces, the ability of people to release code/software anonymously. That is a good thing when we look at a piece of malware or a virus, because now we can find out who wrote it, making it easier to prosecute cyber criminals.

However, the flip side is that we can now also identify people who write code to secure networks, bypass restrictive regimes' firewalls, create privacy applications, etc. There are a lot of people who contribute to open source software but don't want to be identified, for various reasons. For example, if a programmer in China created software that allows a user to bypass the Great Firewall of China, they would definitely not want the Chinese government to be able to identify them, for obvious reasons. Similarly, there are folks who have written software that they do not want associated with their real name, and this would make that much harder for them.

But this is not the end of the world; there are ways around this, such as using software to scramble the code. I don't think many such systems exist right now, or if they do, they are at a nascent stage. If this research is broadly applied to start identifying coders, then the effort to write such scramblers would take high priority, and lots of very smart people would start focusing their efforts on invalidating the detectors.

Well this is all for now. Will write more later.

– Suramya

Original source: Schneier’s Blog

September 27, 2016

How to install Tomato Firmware on Asus RT-N53 Router

Filed under: Computer Software,Knowledgebase,Tech Related,Tutorials — Suramya @ 11:43 PM

I know I am supposed to blog about all the trips I took, but I wanted to get this down before I forget what I did to get the install working. I will post about the trips soon. I promise 🙂

Installing an alternate firmware on my router is something I have been meaning to do for a few years now, but I never really had the incentive to investigate in detail, as the default firmware worked fine for the most part and I didn't really miss any of the special features I would have gotten with the new firmware.

Yesterday my router decided to start acting funny: basically, every time I started transferring large files from my phone to the desktop via sFTP over wifi, the entire router would crash after about a minute or so. This is something that hadn't happened before, and I have transferred gigs of data, so I was stumped. Luckily I had a spare router lying around, thanks to Dad, who forced me to carry it to Bangalore during my last visit. So I swapped the faulty router with the spare one and got my work done. This left the old router sitting on my desk, and since I had some time to kill, I decided to install a custom firmware on it to play with.

I was initially planning on installing dd-wrt, but their site was refusing to let me download the file for the RT-N53 model even though the wiki said I should be able to install it. A quick web search suggested that folks have had a good experience with the Tomato by Shibby firmware, so I downloaded and installed it by following these steps:

Download the firmware file

First we need to download the firmware file from the Tomato Download site.

  • Visit the Tomato download Section
  • Click on the latest Build folder. (I used build5x-138-MultiWAN)
  • Click on ‘Asus RT-Nxx’ folder
  • Download the ‘MAX’ zip file as that has all the functionality. (I used the tomato-K26-1.28.RT-N5x-MIPSR2-138-Max.zip file.)
  • Save the file locally
  • Extract the ZIP file. The file we are interested in is under the ‘image’ folder with a .trx extension

Restart the Router in Maintenance mode

  • Turn off power to router
  • Turn the power back on while holding down the reset button
  • Keep holding reset until the power light starts flashing, which means the router is in recovery mode

Set a Static IP on the Ethernet adapter of your computer

For some reason, you need to set the IP address of the computer you are using to a static IP of 192.168.1.2 with subnet 255.255.255.0 and gateway 192.168.1.1. If you skip this step then the firmware upload fails with an integrity check error.

Upload the new firmware

  • Connect the router to a computer using a LAN cable
  • Visit 192.168.1.1
  • Login as admin/admin
  • Click Advanced Setting from the navigation menu at the left side of your screen.
  • Under the Administration menu, click Firmware Upgrade.
  • In the New Firmware File field, click Browse to locate the new firmware file that you downloaded in the previous step
  • Click Upload. The uploading process takes about 5 minutes.
  • Then unplug the router, wait 30 seconds.
  • Hold down the WPS button while plugging it back in.
  • Wait 30 seconds and release the WPS button.

Now you should be using the new firmware.

  • Browse to 192.168.1.1
  • Login as admin/password (if that doesn’t work try admin/admin)
  • Click on the ‘reset nvram to defaults’ link on the page that comes up. (I had to do this before the system started working, but apparently it's not always required.)

Configure your new firmware

That's it, you now have a router with a working Tomato install. Go ahead and configure it as per your requirements. All functionality seems to be working for me except the 5GHz network, which seems to have disappeared. I will play around with the settings a bit more to see if I can get it to work, but as I hardly ever connected to the 5GHz network it's not a big deal for me.

References

The following sites and posts helped me complete the install successfully. Without them I would have spent way longer getting things to work:

Well this is it for now. Will post more later.

– Suramya

February 25, 2016

Indian Patent office rejects Software patents

Filed under: Computer Software,My Thoughts,Tech Related — Suramya @ 8:00 PM

As you know, software patents are something of a scourge in the computer industry and are hated for the most part (except by the companies using them to make money and stifle innovation and competition). There is extensive debate on the topic, all of which boils down to the following three questions:

  • Should software patents even be allowed? If they are, how do we define the boundary between patentable and non-patentable software?
  • Is the inventive step and non-obviousness requirement applied too loosely to software?
  • Are software patents discouraging innovation instead of encouraging it?

On 19th Feb 2016, the Indian patent office adopted the following three-part test to determine the patentability of Computer Related Inventions (CRIs), which effectively precludes software by itself from being patented:

  • Properly construe the claim and identify the actual contribution;
  • If the contribution lies only in mathematical method, business method or algorithm, deny the claim;
  • If the contribution lies in the field of computer programme, check whether it is claimed in conjunction with a novel hardware and proceed to other steps to determine patentability with respect to the invention. The computer programme in itself is never patentable. If the contribution lies solely in the computer programme, deny the claim. If the contribution lies in both the computer programme as well as hardware, proceed to other steps of patentability.
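The three-part test is essentially a decision procedure, so it can be sketched as one. The function below is my own illustration, not legal advice: the input labels and the simplification of "proceed to other steps" into a True result are assumptions made for the demo.

```python
def survives_cri_test(contribution):
    """Sketch of the three-part CRI test. `contribution` is a set of
    labels describing where the actual contribution of a claim lies.
    Returns True when the claim proceeds to the remaining (ordinary)
    patentability steps, False when the test denies it outright."""
    excluded = {"mathematical method", "business method", "algorithm"}
    # Step 2: contribution lies only in an excluded category -> deny
    if contribution and contribution <= excluded:
        return False
    # Step 3: a computer programme by itself is never patentable; it
    # must be claimed in conjunction with hardware to proceed further
    if "computer programme" in contribution:
        return "hardware" in contribution
    # Contribution lies elsewhere -> assess as a normal invention
    return True

survives_cri_test({"computer programme"})              # denied
survives_cri_test({"computer programme", "hardware"})  # proceeds
```

Encoding the test this way also makes its bite obvious: pure software claims fail at step 3 no matter how they are worded.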

This is a great step in ensuring that useless/basic ideas don't get patented and stifle innovation.

– Suramya

Source: Press Release: Indian Patent Office Says No to Software Patents

