Suramya's Blog : Welcome to my crazy life…

August 29, 2020

You can be identified online based on your browsing history

Filed under: Computer Related,Computer Software,My Thoughts,Tech Related — Suramya @ 7:29 PM

Reliably identifying people online is a bedrock of the million dollar advertising industry. As more and more users become privacy conscious, browsers have been adding features to increase the user’s privacy and reduce the probability of them getting identified online. Users can be identified by cookies, super cookies and so on. Now there is a research paper (Replication: Why We Still Can’t Browse in Peace: On the Uniqueness and Reidentifiability of Web Browsing Histories) that claims to be able to identify users based on their browsing histories. It builds on the earlier paper Why Johnny Can’t Browse in Peace: On the Uniqueness of Web Browsing History Patterns, re-validating its findings and extending them.

We examine the threat to individuals’ privacy based on the feasibility of reidentifying users through distinctive profiles of their browsing history visible to websites and third parties. This work replicates and extends the 2012 paper Why Johnny Can’t Browse in Peace: On the Uniqueness of Web Browsing History Patterns[48]. The original work demonstrated that browsing profiles are highly distinctive and stable. We reproduce those results and extend the original work to detail the privacy risk posed by the aggregation of browsing histories. Our dataset consists of two weeks of browsing data from ~52,000 Firefox users. Our work replicates the original paper’s core findings by identifying 48,919 distinct browsing profiles, of which 99% are unique. High uniqueness holds even when histories are truncated to just 100 top sites. We then find that for users who visited 50 or more distinct domains in the two-week data collection period, ~50% can be reidentified using the top 10k sites. Reidentifiability rose to over 80% for users that browsed 150 or more distinct domains. Finally, we observe numerous third parties pervasive enough to gather web histories sufficient to leverage browsing history as an identifier.

Original paper

Olejnik, Castelluccia, and Janc [48] gathered data in a project aimed at educating users about privacy practices. For the analysis presented in [48] they used the CSS :visited browser vulnerability [8] to determine whether various home pages were in a user’s browsing history. That is, they probed users’ browsers for 6,000 predefined “primary links” such as www.google.com and got a yes/no for whether that home page was in the user’s browsing history. A user may have visited that home page and then cleared their browsing history, in which case they would not register a hit. Additionally a user may have visited a subpage e.g. www.google.com/maps but not www.google.com in which case the probe for www.google.com would also not register a hit. The project website was open for an extended period of time and recorded profiles between January 2009 and May 2011 for 441,627 unique users, some of whom returned for multiple history tests, allowing the researchers to study the evolution of browser profiles as well. With this data, they examined the uniqueness of browsing histories.
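To make “uniqueness” a bit more concrete, here is a rough sketch of how one could measure it. This is not the code from either paper: the function name unique_fraction, the toy data and the way profiles are truncated to a set of top sites are all my own assumptions for illustration.

```python
from collections import Counter

def unique_fraction(histories, top_sites=None):
    """Return the fraction of users whose browsing profile nobody else shares.

    histories: dict mapping a user id to the set of domains that user visited.
    top_sites: optional set of domains; if given, profiles are truncated to it.
    """
    profiles = {}
    for user, domains in histories.items():
        visible = domains if top_sites is None else domains & top_sites
        profiles[user] = frozenset(visible)

    # Count how many users share each distinct profile.
    counts = Counter(profiles.values())
    unique_users = sum(1 for p in profiles.values() if counts[p] == 1)
    return unique_users / len(profiles)

# Toy data (made up): four users, two of whom look identical.
histories = {
    "u1": {"google.com", "slashdot.org", "xkcd.com"},
    "u2": {"google.com", "slashdot.org"},
    "u3": {"google.com", "bbc.co.uk"},
    "u4": {"google.com", "slashdot.org"},
}

print(unique_fraction(histories))                            # full profiles
print(unique_fraction(histories, top_sites={"google.com"}))  # truncated profiles
```

In the toy data, truncating to a single popular site makes everybody look the same, while the full profiles single out half of the users. The papers make the same kind of comparison at a much larger scale.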

This brings to mind a project that I saw a few years ago that would give you a list of websites from the top 1k websites that you had visited in the past using JavaScript and some script-fu. Unfortunately I can’t find the link to the site right now as I don’t remember the name and a generic search is returning random sites. If I find it I will post it here as it was quite interesting.

Well this is all for now. Will post more later.

– Suramya

August 27, 2020

Optimizing the making of peanut butter and banana sandwich using computer vision and machine learning

Filed under: Computer Related,Computer Software,Tech Related — Suramya @ 12:42 AM

The current pandemic is forcing people to stay at home, depriving them of the activities that kept them occupied in the past, so people are getting a bit stir-crazy & bored of staying at home. It’s worse for developers/engineers, as you never know what will come out from the depths of a bored programmer’s mind. Case in point is the effort spent by Ethan Rosenthal in writing Machine Learning/Computer Vision code to optimize the coverage of the banana slices on his peanut butter & banana sandwich so that there is the same amount of banana in every mouthful. The whole exercise took him a few months to complete and he is quite proud of the results.

It’s really quite simple. You take a picture of your banana and bread, pass the image through a deep learning model to locate said items, do some nonlinear curve fitting to the banana, transform to polar coordinates and “slice” the banana along the fitted curve, turn those slices into elliptical polygons, and feed the polygons and bread “box” into a 2D nesting algorithm
[…]
If you were a machine learning model (or my wife), then you would tell me to just cut long rectangular strips along the long axis of the banana, but I’m not a sociopath. If life were simple, then the banana slices would be perfect circles of equal diameter, and we could coast along looking up optimal configurations on packomania. But alas, life is not simple. We’re in the middle of a global pandemic, and banana slices are elliptical with varying size.

The problem of fitting arbitrary polygons (the banana slices) in a box (the slice of bread) is NP-hard, so computing the truly optimal arrangement is impractical. Rosenthal’s solution instead finds a good approximation of the optimum in a reasonable time frame. The final solution is available as a command-line package called “nannernest” which takes a photo of the bread & banana as its argument and returns an optimized slice-and-arrange pattern for the given combination.
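For the curious, the “turn those slices into elliptical polygons” step from the quote can be pictured with a small sketch like the one below. This is not nannernest’s code. The helper names ellipse_polygon and polygon_area, the vertex count and the slice dimensions are all made up by me for illustration.

```python
import math

def ellipse_polygon(cx, cy, a, b, n=32):
    """Approximate an ellipse centred at (cx, cy) with semi-axes a and b by n vertices."""
    return [
        (cx + a * math.cos(2 * math.pi * k / n),
         cy + b * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]

def polygon_area(points):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# One hypothetical banana slice, 3.2 units wide and 2.4 units tall.
slice_poly = ellipse_polygon(0.0, 0.0, a=1.6, b=1.2)
print(len(slice_poly), polygon_area(slice_poly))
```

A nesting algorithm would then try to place many such polygons inside the bread outline without overlaps, which is the hard part that this sketch does not attempt.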


Sample output created by nannernest

Check out the code & the full writeup on the project if you are interested. Even though the application is silly it’s a good writeup on using Machine Learning & Computer Vision for a project.

Source: Boing Boing

– Suramya

August 19, 2020

Convert typed text to realistic handwriting

Filed under: Computer Related,Computer Software,Tech Related — Suramya @ 6:45 PM

There are some tools or projects that really don’t make any practical sense but are a lot of fun to use or just impressive in how they implement technology. The Handwritten.js project by ‘alias-rahil’ is one such project. Basically what it does is take any plain text document and convert it into a realistic looking handwritten page. I tried it out on a few sample documents (logs) and it worked great. The program does dump core if you try converting a 5MB file, but other than that it worked as expected.

Below is a sample file with some quotes that I converted as a test :

* Mountain Dew and doughnuts… because breakfast is the most important meal of the day

* Some days you’re the dog; some days you’re the hydrant.

* He who smiles in a crisis has found someone to blame.

* Marriage is one of the chief causes of divorce

* Earth is 98% full…please delete anyone you can.

* I came, I saw, I decided to order take out.

* F U CN RD THS U CNT SPL WRTH A DM!

* Work hard for eight hours a day, and eventually you may become a boss and be able to work twelve.

* Quitters never win, and winners never quit, but those who never quit AND never win are idiots.

* What’s the difference between a bad golfer and a bad skydiver?

A bad golfer goes, WHACK! “Damn.”
A bad skydiver goes, “Damn.” WHACK!

* Beware of the light at the end of the tunnel. It could be an oncoming train.

* A girl is like a road. The more curves she has the more dangerous she is!

* A woman who dresses to kill probably cooks the same.

The script is fast and didn’t take more than a few seconds to process the file and create a PDF file with the output. The output for my test run is as below:


Output generated by Handwritten.js

I did also try converting a Word file with the software, but instead of using the text of the document for the conversion it converted the raw XML & code inside the file. One suggestion for improvement I have is to enhance the script to support Word files. It would be awesome if it could also convert any of the diagrams, tables etc to look like they were drawn by hand.

Maybe if I have some time I will look into this and see how easy it is to enhance the script. But no promises as I have a ton of other things I need to complete first. 🙂

Source: Hacker News

– Suramya

August 14, 2020

Updating the BIOS to address an AMD Ryzen bug

Filed under: Computer Related,Computer Software,Tech Related — Suramya @ 5:13 PM

Over the past few months I have been infrequently seeing the following warning message in the Terminal and had been ignoring it because apparently the fix was to update the BIOS and I didn’t have the patience/time to do the upgrade at that point in time:

WARNING: CPU random generator seem to be failing, disable hardware random number generation
WARNING: RDRND generated: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
WARNING: CPU random generator seem to be failing, disable hardware random number generation
WARNING: RDRND generated: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

Today I thought that I should fix the error. A bit of Google searching confirmed that I needed to update the BIOS, because apparently there is a bug in the AMD Ryzen 3000 series processors that causes the onboard random number generator to always return 0xffffffff when asked to generate a random number. Obviously getting the same number every time is not optimal even though Dilbert feels otherwise.


Random Number Generator in Accounting
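The warning above boils down to a very simple sanity check: if every value that comes back from the generator is identical, something is wrong. Below is a minimal sketch of that idea. It is not the code that printed the warning, and the helper name looks_stuck is my own. Note also that os.urandom asks the kernel for random bytes rather than calling the CPU instruction directly, so it is only used here to get a healthy-looking sample for comparison.

```python
import os

def looks_stuck(samples, min_samples=4):
    """Return True if every sample is the same value, as with the 0xffffffff bug."""
    samples = list(samples)
    if len(samples) < min_samples:
        raise ValueError("need at least %d samples" % min_samples)
    return len(set(samples)) == 1

# The four values from the warning message.
broken = [0xFFFFFFFF] * 4

# Eight 32-bit values from the operating system's random source.
healthy = [int.from_bytes(os.urandom(4), "little") for _ in range(8)]

print(looks_stuck(broken))
print(looks_stuck(healthy))
```

A real health test would look at far more than “all values equal”, but for this particular bug even such a trivial check is enough to catch it.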

AMD was notified about it last year and they released a BIOS update to fix the issue, however each Motherboard company had to validate and release the new BIOS which took time. The fix was to upgrade the BIOS and I really wasn’t looking forward to it as the last time I upgraded the BIOS it was a painful exercise involving floppy disks and cursing etc.

I looked up my BIOS version using the dmidecode command but that didn’t give me enough information to find the new BIOS version for my motherboard (‘ROG STRIX X570-E GAMING’). So I rebooted the computer and found the built in BIOS upgrade section under Tools. I decided to give it a try and see what options are available so I clicked on the Upgrade option and it gave me the option of connecting to the Internet and automatically downloading the latest version of the BIOS or installing it from a USB/Disk Drive. I selected the Network Install option and the system happily downloaded the latest version of the BIOS from the Internet and then gave me the option to Install the new version. I selected ‘Yes’ and the BIOS was upgraded.
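As a side note, on Linux a lot of what dmidecode prints is usually also exposed as plain files under /sys/class/dmi/id, so the BIOS version and board name can be read without parsing command output. The sketch below assumes that directory exists on your system. The helper name read_dmi_fields and the list of fields are my own choices, and some files there are readable only by root.

```python
from pathlib import Path

def read_dmi_fields(fields=("bios_vendor", "bios_version", "bios_date", "board_name"),
                    base="/sys/class/dmi/id"):
    """Return a dict of DMI field name -> value, skipping fields that cannot be read."""
    result = {}
    for name in fields:
        try:
            result[name] = (Path(base) / name).read_text().strip()
        except OSError:
            continue  # file is missing or we lack permission to read it
    return result

if __name__ == "__main__":
    for key, value in read_dmi_fields().items():
        print(f"{key}: {value}")
```

The board name is the bit you need when searching the manufacturer’s site for a newer BIOS, which in my case turned out not to be necessary.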

The system had to reboot a few times for the upgrade to complete and there was a boot where the system played a bunch of beeps without anything coming up on the display which scared the life out of me but then it immediately rebooted and the display came back. After the upgrade completed I got a screen with a bunch of messages about BIOS settings needing to be reinitialized but when I went into the BIOS the settings were all there. So I rebooted and now all looks good and I don’t see any more weird error messages in the Console or the logs.

I am happy to see that the process to upgrade the BIOS is now so simple and I will be upgrading the BIOS more frequently going forward.

– Suramya

July 27, 2020

Cloaking your Digital Image using Fawkes to thwart unauthorized Deep Learning Models

Filed under: Computer Related,Computer Software,My Thoughts,Tech Related — Suramya @ 3:42 PM

Unless you have been living under a rock you have seen or heard about facial recognition technologies that are actively in use in the world. You have the movie/TV version where a still image from a video feed is instantly compared to every image in the database to match a perp, then you have the real world example where there are systems that take all your social media feeds, images of yours posted anywhere as a dataset to train a system that can identify you from a video feed (not as quickly as the TV version but still fast).

So what is the way to prevent this? Unfortunately there isn’t one (or at least there wasn’t a realistic one till recently). Earlier you had to ensure that no image of yours is ever posted online, you are never caught in a security feed or traffic cam anywhere. Which as you can imagine is pretty impossible in today’s connected world. Even if I don’t post a picture of me online, my friends with whom I attended a party might upload a pic with me in the background and tag me. Or you get peer pressured to upload the photos to FB or Twitter etc.

There is not much we can do about state sponsored learning models but there are plenty of other folks running unauthorized setups that consume photos posted publicly without permission to train their AI models. These are the systems targeted by folks from the SAND Lab at University of Chicago who have developed Fawkes, an algorithm and software tool (running locally on your computer) that gives individuals the ability to limit how their own images can be used to track them.

At a high level, Fawkes takes your personal images, and makes tiny, pixel-level changes to them that are invisible to the human eye, in a process we call image cloaking. You can then use these “cloaked” photos as you normally would, sharing them on social media, sending them to friends, printing them or displaying them on digital devices, the same way you would any other photo. The difference, however, is that if and when someone tries to use these photos to build a facial recognition model, “cloaked” images will teach the model an highly distorted version of what makes you look like you. The cloak effect is not easily detectable, and will not cause errors in model training. However, when someone tries to identify you using an unaltered image of you (e.g. a photo taken in public), and tries to identify you, they will fail.

The research and the tool will be presented at the upcoming USENIX Security Symposium, to be held on August 12 to 14. The software is available for download at the project’s GitHub repository and they welcome contributions.

It would be amazing when this tool matures and I can imagine it becoming a default part of operating systems so that all images uploaded get processed by the tool by default reducing the risk of automatic facial recognition. Although I can’t imagine any of the governments/Facebook being too happy about this tool being publicly available. 🙂

Well this is all for now. Will write more later.

Thanks to Schneier on Security for the initial link.

– Suramya

March 4, 2020

Seti@home project to stop distributing new work to clients after 21 years

Filed under: Computer Software,News/Articles,Tech Related — Suramya @ 1:44 PM

Seti@home has a fond place in my heart. It has been run by the Berkeley SETI Research Center since 1999, and I think I installed it on my machine sometime around Dec 1999 or early 2000 after hearing about it from one of the News websites (possibly Slashdot). Once I had it running on my system I was pushing to get it installed on all the computers in the University computer lab as they were running 24/7 and I saw that as a wasted opportunity for furthering the search for ET. I ran it constantly on my computers till about 2009 post which I switched to running Folding@home which is more focused on protein folding / medical research. Seti was one of the first Distributed computing systems that I know of and the amount of data processed by computers under its umbrella is staggering.

On March 31, the project will stop sending out new work units to users and will instead start focusing on analyzing all the blips identified by volunteers’ machines which could be potential evidence of aliens, with an eye on publishing a research paper. Once this is completed they might start pushing out work packages again but it will be a while before that happens.

“It’s a lot of work for us to manage the distributed processing of data. We need to focus on completing the back-end analysis of the results we already have, and writing this up in a scientific journal paper,” their news announcement stated.

Looking forward to reading the research paper and conclusions generated by the Seti@home program.

Source: SETI@home Search for Alien Life Project Shuts Down After 21 Years

– Suramya

March 2, 2020

Another magical AI to detect “Inappropriate photos” and block kids from taking them

Filed under: Computer Software,My Thoughts,News/Articles,Tech Related — Suramya @ 11:50 AM

In today’s iteration of people who don’t want to make the effort of raising their kids and explaining the difference between right & wrong and why something might be a bad idea we have a new “magical” AI that will automatically detect when kids are making a bad choice and stop them. I mean why should a parent make an effort to talk to their kids and help them understand what the repercussions of a given choice could be when you have AI to make the effort for them? This new AI is being pitched to parents and has an AI-powered “Smartphone Protection” feature that prevents users from shooting or saving “inappropriate” photos (read: naked pictures).

The official Tone Mobile press release hails the TONE e20 as the world’s first phone with an AI that “regulates inappropriate images” through an AI built into the so-called TONE Camera… If the AI recognizes that the subject of a photo is “inappropriate,” the camera will lock up; and if you somehow manage to snap a photo before the AI kicks in, the phone won’t let you save or share it.

Additionally, a feature called “TONE Family” can be set to send an alert to parents whenever an inappropriate image is detected. According to SoraNews24, this alert will contain location data and a pixelated thumbnail of the photo in question.

I give it about 24 hours from when the phone is released till folks figure out a way around it.

The other issue I have with this system is how it’s going to classify the pics. The article doesn’t go into technical details of how the AI works and if the classification is done locally or on the cloud. If it’s on the cloud then every pic taken by that phone is being uploaded to a remote server owned by a 3rd party. This is a massive risk and any breach of that server is going to have a lasting and significant impact. Trust me when I say that this server would be a target of all Black Hat hackers as soon as it goes online.

I am not going to go into whether taking nude pics is a good idea or not. It’s up to the people involved to take that decision; I am not responsible for what you do with your phone. If you have to take naughty pics just ensure you follow basic rules and don’t share them with anyone you don’t trust 100%.

In summary, Dear parents: Instead of offloading your responsibilities to AI try having a frank and open conversation with your kids about why certain things might be a bad idea. It will give you better results than this snakeoil.

Source: Slashdot.org

– Suramya

January 6, 2020

Using Math to figure out why One Knot is Better Than Another

Filed under: Interesting Sites,News/Articles — Suramya @ 3:12 PM

Have you ever wondered why certain knots are more stable than others? Or have you stressed about which knot is the most suitable one to use in your specific use case and had a disagreement with someone about the best option? If so then fear not: MIT researchers have developed a mathematical model to predict a knot’s stability, and now you can argue for your choice with the conviction that math supports it. From the paper’s abstract:

Knots play a fundamental role in the dynamics of biological and physical systems, from DNA to turbulent plasmas, as well as in climbing, weaving, sailing, and surgery. Despite having been studied for centuries, the subtle interplay between topology and mechanics in elastic knots remains poorly understood. Here, we combined optomechanical experiments with theory and simulations to analyze knotted fibers that change their color under mechanical deformations. Exploiting an analogy with long-range ferromagnetic spin systems, we identified simple topological counting rules to predict the relative mechanical stability of knots and tangles, in agreement with simulations and experiments for commonly used climbing and sailing bends. Our results highlight the importance of twist and writhe in unknotting processes, providing guidance for the control of systems with complex entanglements.

To give some more context, below is an extract from a SciTech Daily article covering the research. To be honest I had to read the article a few times to understand what they were talking about but it sounded interesting. Not sure how useful it is but it is definitely interesting. 🙂

In comparing the diagrams of knots of various strengths, the researchers were able to identify general “counting rules,” or characteristics that determine a knot’s stability. Basically, a knot is stronger if it has more strand crossings, as well as more “twist fluctuations” — changes in the direction of rotation from one strand segment to another.

For instance, if a fiber segment is rotated to the left at one crossing and rotated to the right at a neighboring crossing as a knot is pulled tight, this creates a twist fluctuation and thus opposing friction, which adds stability to a knot. If, however, the segment is rotated in the same direction at two neighboring crossing, there is no twist fluctuation, and the strand is more likely to rotate and slip, producing a weaker knot.

They also found that a knot can be made stronger if it has more “circulations,” which they define as a region in a knot where two parallel strands loop against each other in opposite directions, like a circular flow.

By taking into account these simple counting rules, the team was able to explain why a reef knot, for instance, is stronger than a granny knot. While the two are almost identical, the reef knot has a higher number of twist fluctuations, making it a more stable configuration. Likewise, the zeppelin knot, because of its slightly higher circulations and twist fluctuations, is stronger, though possibly harder to untie, than the Alpine butterfly — a knot that is commonly used in climbing.
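To check my own understanding of the “twist fluctuation” rule from the extract, I find it easiest to think of it as counting direction changes between neighbouring crossings. The sketch below is only that reading of the article and not the model from the paper. The function name twist_fluctuations, the +1/-1 encoding and both example sequences are made up for illustration and do not describe any real knot.

```python
def twist_fluctuations(rotations):
    """Count neighbouring crossings whose rotation directions differ.

    rotations: sequence of +1 (rotated right) or -1 (rotated left), one entry
    per crossing, listed in order along a strand.
    """
    return sum(1 for a, b in zip(rotations, rotations[1:]) if a != b)

alternating = [+1, -1, +1, -1]  # made-up strand: direction flips at every crossing
same_way = [+1, +1, +1, +1]     # made-up strand: always rotated the same way

print(twist_fluctuations(alternating))  # 3
print(twist_fluctuations(same_way))     # 0
```

Going by the extract, the first strand would resist slipping better than the second, since every flip adds opposing friction.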

The formal paper is published at: Science Mag.
Thanks to Slashdot for the initial link.

– Suramya

January 3, 2020

Computer made from 32 strands of DNA can now compute the square root of 900

Filed under: News/Articles,Tech Related — Suramya @ 4:28 PM

Early this century (from around the year 2000 onwards) there were three main projects going on in parallel, each of which promised to be the next great breakthrough in Computing which would change the world. These were: DNA Computing, Optical Computing and Quantum Computing. Then, something changed and Quantum Computing took over. In the past few years the tech news & papers have primarily focused on Quantum Computing breakthroughs (which to be fair have been quite significant), while Optical & DNA Computers seemed to have dropped off the map with hardly any news coming from that front. But that has just changed. Thanks to the efforts of Chunlei Guo and his colleagues at the University of Rochester, New York we now have a working DNA computer that uses 32 strands and can compute the square root of square numbers 1, 4, 9, 16, 25 and so on up to 900. This might not sound like much but it is a pretty big deal: now that we can create a system that uses chemistry to compute square roots, we can probably get DNA circuits to do anything.

The prospect of programming molecular computing systems to realize complex autonomous tasks has advanced the design of synthetic biochemical logic circuits. One way to implement digital and analog integrated circuits is to use noncovalent hybridization and strand displacement reactions in cell‐free and enzyme‐free nucleic acid systems. To date, DNA‐based circuits involving tens of logic gates capable of implementing basic and complex logic functions have been demonstrated experimentally. However, most of these circuits are still incapable of realizing complex mathematical operations, such as square root logic operations, which can only be carried out with 4 bit binary numbers. A high‐capacity DNA biocomputing system is demonstrated through the development of a 10 bit square root logic circuit. It can calculate the square root of a 10 bit binary number (within the decimal integer 900) by designing DNA sequences and programming DNA strand displacement reactions. The input signals are optimized through the output feedback to improve performance in more complex logical operations. This study provides a more universal approach for applications in biotechnology and bioengineering.
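To get a feel for the size of the problem, here is what “the square root of a 10 bit binary number (within the decimal integer 900)” looks like in ordinary software. This says nothing about how the DNA circuit itself works. The function name sqrt_of_perfect_square and the range checks are my own, and math.isqrt needs Python 3.8 or newer.

```python
import math

def sqrt_of_perfect_square(n):
    """Return the integer square root of n, which must be a perfect square up to 900."""
    if not 0 <= n <= 900:
        raise ValueError("input must be between 0 and 900")
    root = math.isqrt(n)
    if root * root != n:
        raise ValueError("input must be a perfect square")
    return root

# Show input and output in both decimal and binary.
for n in (1, 4, 9, 16, 25, 900):
    root = sqrt_of_perfect_square(n)
    print(f"{n:>3} = {n:010b}  ->  {root:>2} = {root:05b}")
```

The largest case is 900, which is 1110000100 in binary (10 bits), and its root 30 is 11110 (5 bits). So the circuit has to map a 10 bit input to a 5 bit output.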

The paper published in “Small” has more details but is behind a paywall (which sucks) so I don’t have many more details than what the New Scientist article and the paper abstract share. At the price they are asking I don’t think it’s value for money just so that I can satisfy my curiosity about the breakthrough. If you disagree and download the paper, please share 🙂

Looking forward to more such news (in an accessible journal) in 2020.

– Suramya

October 31, 2019

You can’t have ‘b’, ‘l’, ‘m’, ‘r’, and ‘t’ in your password if you are using macOS 10.15.1 aka Catalina

Filed under: Funny News,My Thoughts,Tech Related — Suramya @ 12:50 PM

Users of the Twitter app on macOS 10.15.1 aka Catalina just found out that they couldn’t log in to their account if their password contained any of the following characters: ‘b’, ‘l’, ‘m’, ‘r’ or ‘t’. When I first read the news I thought it was a joke but then realized that it’s an actual issue in the latest version of macOS. The problem is showing up on the Twitter app but other programs might be affected as well.

According to Twitter in-house developer Nolan O’Brien, these particular keypresses are gobbled up by a regression associated with the operating system’s shortcut support. Normally, users can press those aforementioned keys as shortcuts within the app to perform specific actions, such as ‘t’ to open a box to compose a new tweet.

Something changed within macOS to capture those shortcut keys, rather than pass them to the password field in the user interface as expected. So, in other words, when you press a shortcut key in Twitter when entering an account password, the keypress is ignored in that context rather than handled as a legit password keypress.
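The behaviour described above is easy to mimic with a toy model. The sketch below is not how macOS or the Twitter app dispatch key events. The function name type_into_password_field and the shortcuts_win flag are made up by me, and the only thing it models is “shortcut keys get consumed before the password field sees them”.

```python
SHORTCUTS = {"b", "l", "m", "r", "t"}  # the keys reported as affected

def type_into_password_field(keys, shortcuts_win):
    """Return the text that ends up in the password field.

    shortcuts_win=True mimics the regression: shortcut keys are consumed
    before the focused field gets to see them.
    """
    received = []
    for key in keys:
        if shortcuts_win and key in SHORTCUTS:
            continue  # swallowed by the shortcut handler
        received.append(key)
    return "".join(received)

print(type_into_password_field("You're typing it wrong", shortcuts_win=True))
# You'e yping i wong
print(type_into_password_field("You're typing it wrong", shortcuts_win=False))
# You're typing it wrong
```

Which is exactly the joke in The Register’s headline.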

This reminded me of the weird and basic bugs that showed up in older versions of Windows. Apple really needs to work on their quality control if they want to stay in the game.

Source: The Register: You’e yping i wong: macOS Catalina stops Twitter desktop app from accepting B, L, M, R, and T in passwords

– Suramya
