Suramya's Blog : Welcome to my crazy life…

October 31, 2023

Firefox built-in local translation works quite well

Filed under: Tech Related — Suramya @ 11:59 PM

Mozilla recently released Firefox 118, and one of the interesting features in the release is local translation of websites. This means that all the translation is done locally on the machine running Firefox, without sending the content to an external service such as Google Translate.

I have been using it infrequently and am impressed with the quality of the translations. Historically, local translation tools have not translated well, and most of the time we ended up with a literal translation of each word. Firefox’s translations are high quality and use language packs, which the user has to download once to the local system, after which Firefox can start translating websites. The supported languages in the initial release were English, German, French, Italian, Spanish, Portuguese, Dutch, Polish and Bulgarian. Support for additional languages is being added iteratively.

The next release of Firefox (120) will have support for new languages: Catalan, Czech, Estonian, Finnish, Hungarian, Icelandic, Norwegian (Bokmål and Nynorsk), Persian, Russian, Ukrainian. You can try them out in the nightly build but the support is still a work in progress and not ready for prime time use. I am waiting for support for the Indian languages to be added along with support for pages which have content in a mix of languages.

You should download the latest version of Firefox and try it out. It is free and doesn’t have all the monitoring tools that Chrome has built-in.

– Suramya

October 28, 2023

New tool called Nightshade allows artists to ‘poison’ AI models

Filed under: Artificial Intelligence,Tech Related — Suramya @ 12:20 AM

Generative AI has burst onto the scene with a bang, and while the image generation tech is not perfect yet, it is getting more and more sophisticated. Due to the way the tech works, a model needs to be trained on existing art, and most of the models on the market right now have been trained on artwork available on the internet, whether or not it was in the public domain. Because of this, multiple lawsuits have been filed against AI companies by artists.

Unfortunately this has not stopped AI models from using these images as training data. So while the question is being debated in the courts, researchers at the University of Chicago have created a new tool called Nightshade that allows artists to poison the training data for AI models. This functionality will be an optional setting in their earlier product Glaze, which cloaks digital artwork and alters its pixels to confuse AI models about its style. Nightshade goes one step further by making the AI learn the wrong names for the objects in a given image.

Optimized prompt-specific poisoning attack we call Nightshade. Nightshade uses multiple optimization techniques (including targeted adversarial perturbations) to generate stealthy and highly effective poison samples, with four observable benefits.

  • Nightshade poison samples are benign images shifted in the feature space. Thus a Nightshade sample for the prompt “castle” still looks like a castle to the human eye, but teaches the model to produce images of an old truck.
  • Nightshade samples produce stronger poisoning effects, enabling highly successful poisoning attacks with very few (e.g., 100) samples.
  • Nightshade samples produce poisoning effects that effectively “bleed-through” to related concepts, and thus cannot be circumvented by prompt replacement, e.g., Nightshade samples poisoning “fantasy art” also affect “dragon” and “Michael Whelan” (a well-known fantasy and SciFi artist).
  • We demonstrate that when multiple concepts are poisoned by Nightshade, the attacks remain successful when these concepts appear in a single prompt, and actually stack with cumulative effect. Furthermore, when many Nightshade attacks target different prompts on a single model (e.g., 250 attacks on SDXL), general features in the model become corrupted, and the model’s image generation function collapses.

In their tests the researchers poisoned images of dogs to include information in the pixels that made it appear to an AI model as a cat. After sampling and learning from just 50 poisoned image samples, the AI began generating images of dogs with strange legs and unsettling appearances. After 100 poison samples, it reliably generated a cat when asked by a user for a dog. After 300, any request for a dog returned a near perfect looking cat.

Obviously this is not a permanent solution: the teams training AI models will start working on countering it immediately, and the whack-a-mole cycle of fixes and updates to one-up each other will continue for the foreseeable future (similar to how virus and anti-virus programs have been at it).

Full paper: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models (PDF)
Source: Venturebeat: Meet Nightshade, the new tool allowing artists to ‘poison’ AI models

– Suramya

October 26, 2023

It’s ok to ask questions about basic stuff that ‘everyone’ knows about

Filed under: My Thoughts,Tech Related — Suramya @ 12:12 PM

There is a well known meme where people talk about how the questions they asked were ‘cringe’ and make fun of the questions other people ask. One such example is the comic below that showed up in my feed. Here the refrain is that having to read all the questions someone has asked Google/ChatGPT about programming is equivalent to torturing them, the implication being that the questions were so basic that everyone should know the answers. I get that people are trying to be funny, but there is a problem with these kinds of posts: they actively discourage people from asking questions and build the narrative that people who post ‘stupid’ questions are not smart and that their questions are cringe. They also promote imposter syndrome, because people start thinking that they don’t know much when they have to search for ‘basic’ stuff.

Let the torture commence. Let’s reveal all the coding related questions you asked on Google and ChatGPT

Instead I prefer the XKCD approach called the 10,000.

In this strip, Randall presents a mathematical argument against the idea of making fun of people for their ignorance.
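
The arithmetic behind the strip can be sketched in a few lines. This is a rough reconstruction and not a quote from the comic: it assumes about 4 million births a year in the US and that everyone learns a given ‘common knowledge’ fact by adulthood. The function name is made up for illustration.

```python
def first_timers_per_day(births_per_year: float) -> float:
    """People hearing a piece of 'common knowledge' for the first time, per day.

    Steady-state assumption: nobody knows the fact at birth and everybody
    knows it by adulthood, so the number of people learning it each year
    equals the number of people born each year.
    """
    return births_per_year / 365


# Assumed figure: roughly 4 million births per year in the US.
print(round(first_timers_per_day(4_000_000)))  # on the order of 10,000
```

So on any given day, for any fact that ‘everyone knows’, there are around ten thousand people who are about to learn it for the first time.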

There are so many things that I know that others don’t, just as there are so many things that you know that I don’t know. This is because each of us has different life experiences/upbringing etc. Expecting everyone to know the same things as you do is super egoistical.

I have been a developer for 25+ years now and I still look up syntax when I am coding. Knowing the proper syntax for a command doesn’t make you a programmer; knowing what command/logic to use does. I can always look up the syntax, but the basic logic to solve the problem is something that I have to come up with, and that is what I usually test when interviewing people. I need people who can solve problems, not someone who can regurgitate the syntax for a function in C++/Python.

When I was in high-school (10th Standard) my senior project was to create an address book where we used the LOCATE command quite extensively to make the output pretty (this was in GW-BASIC). In my preboard exams, during the viva, I was asked to give the syntax of the LOCATE command. I always got confused about the parameters for this command and couldn’t remember if it was LOCATE [row][,[col]] or LOCATE [col],[row]. I guessed and gave the wrong order, so the teacher told me that she doubted that I had coded the program as I didn’t even know the syntax of the command. I responded by telling her that I don’t need to remember the syntax because I can refer to the book when I need it, but the logic of the program is what I focused on, and challenged her to quiz me on that. I remember she was pretty taken aback by this, and I did get a good score on the viva, but she told me not to be so blunt during the actual board exam vivas.

I have sat in meetings where people have talked about concepts or used examples I had no clue about. Sometimes I would interrupt to ask for clarification, and at other times I would make a note and do a lot of research before the next meeting so I understood what we were talking about. I am not saying that people shouldn’t do research or put in effort before asking questions. I am saying that we need to be supportive of newcomers to the field who don’t have the experience to know all the things that might be obvious to you. In the past I used to refer folks to How To Ask Questions The Smart Way by Eric Steven Raymond & Rick Moen when I talked about how to ask questions. However, as I have gotten older and more experienced, I find that while the FAQ has some good points it is absolutely condescending and not really the right approach to asking questions. So instead I now refer people to Julia Evans’s post on How to ask good questions.

How to ask Good Questions

Teaching people that it is ok to ask questions is an important part of being a mentor and training the next generation.

– Suramya

October 23, 2023

HashiCorp CEO attempts to gaslight folks into believing that foundations of Open Source are bad

Filed under: My Thoughts,Tech Related — Suramya @ 10:03 PM

On August 10, 2023, HashiCorp decided to change the license for its products (including Terraform) from the open source MPL v2 license to the non-open source BSL v1.1 license, starting from the next (1.6) version. This change was met with widespread vocal opposition from the community. When HashiCorp refused to reconsider, the community decided to take action and forked the last stable version of Terraform into a new solution called OpenTofu (Yes really…). Almost immediately over 100 companies, 10 projects, and 400 individuals pledged their time and resources to keep Terraform open-source, and the number is growing fast.

With more and more companies looking at alternatives, The Stack has published an interview with the HashiCorp CEO. I am having a hard time figuring out if this is a Troll piece in line with The Onion or an actual interview, because the quotes in the interview are mind boggling. From the interview:

HashiCorp’s CEO predicted there would be “no more open source companies in Silicon Valley” unless the community rethinks how it protects innovation, as he defended the firm’s license switch at its user conference this month.

Every company I have worked with over the last decade has worked with Open Source software and companies, and I don’t think a single one would make a statement like the one he is claiming.

While open source advocates had slammed the license switch, McJannet described the reaction from its largest customers as “Great. Because you’re a critical partner to us and we need you to be a big, big company.”

Indeed, he claimed that “A lot of the feedback was, ‘we wished you had done that sooner’” – adding that the move had been discussed with the major cloud vendors ahead of the announcement.

“Every vendor over the last three or four years that has reached any modicum of scale has come to the same conclusion,” said McJannet.

Here’s my personal favorite:

He said the Linux Foundation’s adoption of Open Tofu raised serious questions. “What does it say for the future of open source, if foundations will just take it and give it a home. That is tragic for open source innovation. I will tell you, if that were to happen, there’ll be no more open source companies in Silicon Valley.”

Because foundations allow communities to continue with products they love and built even when corporates try to lock them in a closed garden, it is apparently a bad thing. Yes, it is a bad thing for folks who just want to take advantage of the community and enshittify the product while making money and ignoring the needs of the community. Open Source allows people to fork the product (which was built with Open Source contributions) and keep it free and out of the greedy hands of corporate CEOs. That’s the beauty of Open Source, and this article is a poor attempt to gaslight folks into believing that this is a bad thing.

– Suramya

October 19, 2023

How to approach a topic to make learning hard things easy?

Filed under: Interesting Sites,My Thoughts,Tech Related — Suramya @ 7:16 PM

Talking about complicated topics is hard. I remember reading somewhere that if you can’t explain what you do in simple enough terms that a grandmother can understand it, then you don’t know enough about what you are doing. Unfortunately I can’t find the original quote, but if you think about it, it makes sense. People who don’t understand a given topic in depth will revert to using acronyms or jargon to explain what they do. Folks who do understand will be able to explain it using small words and concepts. The best example of this is Thing Explainer: Complicated Stuff in Simple Words, a book by Randall Munroe of XKCD fame. In the book, things are explained in the style of Up Goer Five, using only drawings and a vocabulary of the 1,000 (or “ten hundred”) most common words. Explore computer buildings (datacenters), the flat rocks we live on (tectonic plates), the things you use to steer a plane (airliner cockpit controls), and the little bags of water you’re made of (cells). My niece and nephew love the book and refer to it regularly.

Julia Evans recently gave a talk on Making Hard Things Easy that everyone should listen to or read, since she also published a transcript, which was awesome as otherwise I would have missed out on this great talk. She talks about how to approach a problem/question/topic to make it easier to understand, with examples from her own experience.

Julia is a whiz at making difficult topics seem easy. She publishes webzines that explain computer topics in an easy to understand comic format. I have bought all the ones she has published so far as PDFs and would recommend you do the same. The site above has samples of her work, so do check it out.

– Suramya

October 17, 2023

Best Support response times and quality I have seen is from the WordPress Activitypub team

Filed under: Computer Software,My Thoughts,Tech Related — Suramya @ 10:49 PM

I have been using Open Source since I found out about it back in 1999. At present the majority of the software I have running on my system is open source, with a few notable exceptions such as Microsoft Word (LibreOffice still has formatting issues), CrossOver by CodeWeavers (which allows me to run Windows software on Linux) and a few games that I don’t get to play enough. This means that I have considerable experience with the support offered by the various open source projects. The support ranges from RTFM, through no response at all, to detailed responses from the team/users.

Out of all the projects that I have reached out to for support, the most fantastic & the fastest response has been from Matthias Pfefferle (German Site) of the wordpress-activitypub project. I have raised multiple tickets with the project and have always gotten a quick (Fastest response in 2mins!!!), detailed and helpful response to my questions. Some of the issues I raised required a code fix, and a fix was released within days. I don’t think I have received such a fantastic response even from sites/projects where I am a paying subscriber.

Anyways, we always post about the bad experiences we have, so I think that we should also take time to post about the fantastic experiences and people we interact with. There is way too much negative news out there, and these small things can help bring a smile to someone’s face and make sure they know that their hard work is appreciated.

If you run a WordPress Blog (self-hosted) you should definitely install this plugin and federate your posts to Mastodon (and the rest of the fediverse).

– Suramya

October 12, 2023

Someone got fired for not using Windows because the invasive workplace surveillance tool didn’t work well on Linux

Filed under: Linux/Unix Related,My Thoughts,Tech Related — Suramya @ 9:38 PM

There are a lot of reasons why I recommend people don’t use Windows, but there are times when you have to use it because it is required for work, or for other reasons such as compatibility (though CrossOver by CodeWeavers is a lifesaver for that). Over at HackerNews, there is a thread about a post over at Reddit (I guess people are still using it…) where a guy asks: “I Lost my job because I refused to use Windows, who is at fault?”

I have been using Windows at work at almost every company I have worked with because that is the default, and most corporate apps are designed for and work only with Windows systems. Since I personally prefer using Linux, I have asked for (and in some cases gotten) a Linux version of the desktop for my use. The main blocker for corporations using something like CrossOver is the problem of support. If a company is running MS Office on Linux using CrossOver and they hit an issue, MS can and does blame it on the setup and asks you to revert to a standard setup. I have even heard folks claim that they (MS) have blamed custom plugins that the company was running for the issues that were being reported.

All that said and done, I don’t think I would ever point blank refuse to use Windows when my company asks me to run it and threatens termination if I don’t. Though to be honest I would also have started looking for other opportunities if I was in this person’s shoes, since as per their post the reason for the demand was that: “A software they use for time tracking didn’t support screenshots on Wayland and I refused to switch to Windows (xorg is just no for me) to support them.”

Having a program running on my personal machine that constantly takes screenshots and uploads them to a remote server is not something I would agree to do. We don’t know what company they were working for but this kind of invasive surveillance might not be 100% legal in all locations. A company might get away with it on work systems if they have a contract and the user explicitly agrees to it but on a personal machine… If the user forgets it is running and accesses their health record, or bank account or other sensitive data their employer would have a copy of that data. Imagine if they got breached, how much sensitive & personal data might get exposed with this setup.

A lot of work has been put into these surveillance technologies, and there is a whole industry around monitoring people at work to ensure they are actually working. At a previous company, a team wanted to put software on all office computers that would track how long a person was actually typing or moving the mouse, use that to calculate their productivity, and then rate them on it. After the system was demoed, I asked how it accounted for time spent in face-to-face meetings, design discussions, calls etc that don’t necessarily need a computer. The answer was vague enough that the head of the department remarked that, if it was implemented, every single member of the management team would be rated as non-productive, as the majority of their time was spent in meetings and discussions.

During covid a lot of companies were worried that folks working from home would not actually work and started tracking mouse/keyboard activity. So people came up with ingenious solutions to ensure that the mouse was moved and text was typed on the office systems. Some of this was done via software/scripts, while others used hardware hacks such as taping the mouse to a desk fan.

This kind of monitoring is routinely done on employees who don’t have many options and are not able to move jobs easily. The end result is that the company is trying to maximize its profit by nano-managing its employees and using this tech to squeeze all possible work out of them while paying the minimum amount.

Now coming back to the original question: was it wrong to insist on using Linux when the job requires you to use Windows? If the company gave me a laptop/computer running Windows and I formatted it to run Linux, then I would be in the wrong. If I am using my own computer, then I can use whatever OS I want as long as the work gets done. However, if someone insists on using Linux on a work computer when the company requires Windows, and even after multiple warnings they don’t switch back to Windows, then the company is right to fire them. (Assuming that there are no other issues such as the invasive monitoring we talked about earlier.)

Many people will find this stance unacceptable, but there is a rationale behind it that not everyone thinks about. The company might be legally required to keep records/logs of work, mails sent etc, and the audit requirements would not be met if a non-compliant system was in use. Similarly, the default backup and archiving systems might not work with Linux and cause problems. There are a ton of issues that would need to be worked out before having a mixed OS landscape, and if there are no other considerations then the company can be justified in firing a person who refuses to use Windows because they don’t like it.

Source: Hacker News: Lost my job because I refused to use Windows, who is at fault?

– Suramya

September 5, 2023

Invalid Flight plan submission to UK National Air Traffic Services causes multi-day chaos

Filed under: My Thoughts,Tech Related — Suramya @ 6:50 PM

One of the cardinal rules in computing is “never trust the input” or, put another way, “never trust user input”. If you ever wondered what would happen if this wasn’t followed, here’s a real world example from late last month (28th Aug), when almost every flight to and from the UK was delayed or cancelled after the air traffic control systems went down.

An analysis of the crash found that a French airline had filed a flight plan in the wrong format with the National Air Traffic Services (NATS) and, instead of rejecting the plan because it was in an invalid format as it should have done, the entire system went down hard. Validating input is a basic programming principle, and I am not sure why their testing didn’t catch this massive vulnerability. Basically, it looks like anyone with access to file a flight plan can crash the entire NATS just by submitting a flight plan in the wrong format.
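
To make the principle concrete, here is a minimal sketch of what “reject the bad record and keep going” can look like. The record layout (a dict with callsign, origin, destination and waypoints) and all the names are invented for illustration and have nothing to do with the real NATS formats or systems.

```python
import re
from dataclasses import dataclass

# Simplified, assumed check: four uppercase letters, e.g. "EGLL".
AIRPORT_CODE = re.compile(r"[A-Z]{4}")


class InvalidFlightPlan(ValueError):
    """Raised when a submitted plan fails validation."""


@dataclass(frozen=True)
class FlightPlan:
    callsign: str
    origin: str
    destination: str
    waypoints: tuple


def parse_plan(raw) -> FlightPlan:
    """Validate one submitted plan; raise InvalidFlightPlan on bad input."""
    try:
        callsign = raw["callsign"]
        origin = raw["origin"]
        destination = raw["destination"]
        waypoints = raw["waypoints"]
    except (KeyError, TypeError) as exc:
        raise InvalidFlightPlan(f"missing or unreadable field: {exc}") from exc
    if not isinstance(callsign, str) or not callsign.strip():
        raise InvalidFlightPlan("callsign must be a non-empty string")
    for label, code in (("origin", origin), ("destination", destination)):
        if not isinstance(code, str) or not AIRPORT_CODE.fullmatch(code):
            raise InvalidFlightPlan(f"{label} is not a 4-letter code: {code!r}")
    if not isinstance(waypoints, (list, tuple)) or not all(
        isinstance(w, str) and w for w in waypoints
    ):
        raise InvalidFlightPlan("waypoints must be a list of non-empty strings")
    return FlightPlan(callsign.strip(), origin, destination, tuple(waypoints))


def process_submissions(submissions):
    """Accept valid plans and set invalid ones aside instead of stopping."""
    accepted, rejected = [], []
    for raw in submissions:
        try:
            accepted.append(parse_plan(raw))
        except InvalidFlightPlan as err:
            # Keep the raw input and the reason so a human can review it.
            rejected.append((raw, str(err)))
    return accepted, rejected
```

The rejected plans are kept with the reason for rejection rather than silently dropped, so nothing is thrown away without a person getting a chance to look at it, while every other plan continues to be processed.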

Apparently it is expected behavior as per NATS chief executive Martin Rolfe, who said that both Primary AND Backup systems responded to the incorrect flight data by suspending automatic processing “to ensure that no incorrect safety-related information could be presented to an air traffic controller or impact the rest of the air traffic system”

Nats chief executive, Martin Rolfe, told BBC Radio 4’s Today programme: “It wasn’t an entire system failure. It was a piece of the system, an important piece of the system.

“But in those circumstances, if we receive an unusual piece of data that we don’t recognise, it is critically important that that information – which could be erroneous – is not passed to air traffic controllers.”

Mr Rolfe said Nats has “safety-critical systems” and “throwing data away needs to be very carefully considered”.

To me it is unbelievable that anyone thought that crashing both the Primary and Backup systems was preferable to throwing away an invalid flight plan.


– Suramya

August 31, 2023

Using LLMs to change writing style to hide author?

Filed under: Artificial Intelligence,My Thoughts,Tech Related — Suramya @ 12:17 PM

It is fairly well known that folks can identify a writer based on their writing style. In fact there is a whole field of work called Stylometry that analyses writings to try to attribute authorship of documents/writings. This is used when new text/writings are found that are not attributed to any person, when agencies receive letters, or when articles are posted on extremist sites.

In 1964, Frederick Mosteller and David Wallace published a three-year study of the distribution of common words in the Federalist Papers and showed that the writing styles of Alexander Hamilton and James Madison differed in subtle ways. The researchers analyzed some disputed papers that both writers claimed to have written and, based on their analysis of the respective writing styles of both authors, were able to identify that the papers were likely to have come from Madison’s pen. Since then the technique has only improved. With AI and machine learning algorithms coming into play, researchers are able to identify the writing style from a post quickly and mostly accurately. Keep in mind that this is still more of an art than a science. However, it works well enough that people were able to identify the author of the book “The Cuckoo’s Calling”, by Robert Galbraith, as none other than JK Rowling.
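
To give a feel for the idea, here is a toy sketch that compares texts by how often they use a handful of common words. It is a much simpler stand-in and not the statistical method Mosteller and Wallace actually used; the word list and the function names are arbitrary choices for illustration.

```python
import re
from collections import Counter

# Illustrative list of common "function" words; a real study would use many more.
FUNCTION_WORDS = ("the", "of", "to", "and", "in", "by", "on", "upon", "while", "whilst")


def profile(text):
    """Rate per 1,000 words for each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [1000 * counts[w] / total for w in FUNCTION_WORDS]


def distance(p, q):
    """Sum of absolute differences between two profiles (smaller = more alike)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def closest_author(disputed, known_samples):
    """Return the author whose sample profile is nearest to the disputed text."""
    target = profile(disputed)
    return min(
        known_samples,
        key=lambda author: distance(target, profile(known_samples[author])),
    )
```

Calling closest_author with a disputed text and a dict of known writing samples per author returns the name of the nearest match. With real texts you would need long samples for the rates to be stable, which is part of why this remains more of an art than a science.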

Now, as we all know “With every action there is an equal and opposite reaction”, and there has been a lot of work put into what is called Adversarial stylometry, which is the practice of consciously changing the writing style when writing to remove the identifying characteristics that can be used to figure out who wrote the text. Historically, this was achieved by paraphrasing the text and by having another person edit & modify the content.

With the recent advances in LLMs, especially systems like ChatGPT, it might be possible to use them to whitewash the text and remove enough identifying marks that the text becomes relatively more anonymous. I tried it out with the previous paragraph as the input and got the following as the result:

As commonly understood, the principle of “For every action, there is an opposing reaction” holds true. Considerable effort has been dedicated to a field known as Adversarial Stylometry. This involves intentionally altering one’s writing style to eliminate distinctive traits that could be employed to determine the authorship of a text. In the past, this was accomplished by rephrasing the content and enlisting the assistance of a third party to revise and adjust the material.

As you can see, the output is different enough in words, tone and phrasing that it would be hard to attribute correctly. Do it a few times in a loop and, after ensuring the original meaning is not lost, you can use it and be relatively sure that it will be difficult to identify the author of the text.

Obviously, if LLMs are going to be used to obfuscate text then other LLMs will be created that try to de-obfuscate the text, and the systems will keep getting smarter and smarter.

– Suramya

August 29, 2023

Excel holding up the Global Financial System, now with Python support

Filed under: Computer Security,Computer Software,My Thoughts,Tech Related — Suramya @ 1:12 PM

It is both impressive and scary how much of the world’s financial system is run using Microsoft Excel. Folks have created formulas/macros/scripts/functions etc in Excel that allow them to generate data that is used to make major financial decisions with real world impact.

In one of my previous companies we actually had a full discussion on how to get an inventory of all the Excel code in use at the company and how to archive it so that we have backups and version control on them. Unfortunately, I left before much headway was made but I did learn enough about excel use to scare me. (Especially since I am not the biggest fan of Microsoft software 😉 )

Now you might ask why so many people are using Excel when there are better tools available in the market and these companies have in-house teams to create custom software for the analysts. I asked the exact same question when I started. I think it is probably because the tool makes it easy for folks to come up with formulas and scripts that get their work done, instead of having to wait for an external team to make the changes they need.

Now, a few days ago Microsoft made a surprise announcement that going forward they are going to support running Python inside an Excel file. Yikes!! In order to use this functionality you will need to be part of the Microsoft 365 Insider program and then you can type Python code directly into cells using the new =PY() function, which then gets executed in the cloud. From what I have read, this will be enabled by default and needs to be disabled via a registry key.

Since its inception, Microsoft Excel has changed how people organize, analyze, and visualize their data, providing a basis for decision-making for the millions of people who use it each day. Today we’re announcing a significant evolution in the analytical capabilities available within Excel by releasing a Public Preview of Python in Excel. Python in Excel makes it possible to natively combine Python and Excel analytics within the same workbook – with no setup required. With Python in Excel, you can type Python directly into a cell, the Python calculations run in the Microsoft Cloud, and your results are returned to the worksheet, including plots and visualizations.

We already have issues with Excel macros being used as vectors for malware & viruses, and this just opens a whole new front in that war. Now admins will have to worry about attackers using Python in Excel to infiltrate the organization or to send data outside the org. I can see how it is useful for people working with datasets, and MS is adding this functionality to keep up with other tools such as Tableau which are more powerful, but I still feel that this is a bad move.

Another problem that folks are going to face is that now that Excel sheets have Python programs inside them, how are we supposed to version the code, and how is code review done? Basically this code should be going through the standard SDLC (Software Development Life Cycle) process but won’t. We also need to ensure that all changes are reviewed and monitored to protect against insider attacks, but the way the system is set up this is going to be extremely difficult (We have already seen that with Macros and Formulas etc).
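
As a rough starting point for getting at least the formulas under version control, here is a sketch that dumps every formula in a workbook to plain text that can be diffed and reviewed. It relies on the fact that an .xlsx file is a zip archive of XML worksheets. The function names are made up for illustration, and it only covers ordinary cell formulas: it does not look at VBA macros, it lists a shared formula only on the cell that defines it, and I have not checked how Python in Excel cells are stored, so treat those as out of scope.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts inside an .xlsx archive.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def extract_formulas(path_or_file):
    """Return (worksheet part, cell reference, formula text) for each formula cell."""
    formulas = []
    with zipfile.ZipFile(path_or_file) as archive:
        for name in sorted(archive.namelist()):
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(name))
            for cell in root.iterfind(".//m:c", NS):
                formula = cell.find("m:f", NS)
                # Cells that only reuse a shared formula have an empty <f/> element.
                if formula is not None and formula.text:
                    formulas.append((name, cell.get("r"), formula.text))
    return formulas


def as_text(formulas):
    """One line per formula, suitable for committing and diffing."""
    return "\n".join(f"{part}!{ref}\t={text}" for part, ref, text in formulas)
```

Writing the output of as_text to a file next to each workbook and committing both would at least let a reviewer see which formulas changed between versions, even if it is a long way from a proper SDLC process.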

Let’s see how folks address this risk profile.

– Suramya
