Suramya's Blog : Welcome to my crazy life…

October 28, 2023

New tool called Nightshade allows artists to ‘poison’ AI models

Filed under: Artificial Intelligence,Tech Related — Suramya @ 12:20 AM

Generative AI has burst onto the scene with a bang, and while the image generation tech is not perfect yet, it is getting more and more sophisticated. Due to the way the tech works, the model needs to be trained on existing art, and most of the models on the market right now have been trained on artwork available on the internet, whether or not it was in the public domain. Because of this, artists have filed multiple lawsuits against AI companies.

Unfortunately this has not stopped AI models from using these images as training data, so while the question is being debated in the courts, researchers at the University of Chicago have created a new tool called Nightshade that allows artists to poison the training data for AI models. This functionality will be an optional setting in their earlier product Glaze, which cloaks digital artwork by altering its pixels to confuse AI models about its style. Nightshade goes one step further by making the AI learn the wrong names for the objects in a given image.

Optimized prompt-specific poisoning attack we call Nightshade. Nightshade uses multiple optimization techniques (including targeted adversarial perturbations) to generate stealthy and highly effective poison samples, with four observable benefits.

  • Nightshade poison samples are benign images shifted in the feature space. Thus a Nightshade sample for the prompt “castle” still looks like a castle to the human eye, but teaches the model to produce images of an old truck.
  • Nightshade samples produce stronger poisoning effects, enabling highly successful poisoning attacks with very few (e.g., 100) samples.
  • Nightshade samples produce poisoning effects that effectively “bleed-through” to related concepts, and thus cannot be circumvented by prompt replacement, e.g., Nightshade samples poisoning “fantasy art” also affect “dragon” and “Michael Whelan” (a well-known fantasy and SciFi artist).
  • We demonstrate that when multiple concepts are poisoned by Nightshade, the attacks remain successful when these concepts appear in a single prompt, and actually stack with cumulative effect. Furthermore, when many Nightshade attacks target different prompts on a single model (e.g., 250 attacks on SDXL), general features in the model become corrupted, and the model’s image generation function collapses.

In their tests the researchers poisoned images of dogs to include information in the pixels that made it appear to an AI model as a cat. After sampling and learning from just 50 poisoned image samples, the AI began generating images of dogs with strange legs and unsettling appearances. After 100 poison samples, it reliably generated a cat when asked by a user for a dog. After 300, any request for a dog returned a near perfect looking cat.

Obviously this is not a permanent solution: the teams training AI models will start working on a fix immediately, and the whack-a-mole cycle of fixes and updates to one-up each other (similar to how virus & anti-virus programs have been at it) will continue for the foreseeable future.

Full paper: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models (PDF)
Source: Venturebeat: Meet Nightshade, the new tool allowing artists to ‘poison’ AI models

– Suramya

October 9, 2023

Microsoft AI responds with absolute nonsense when asked about a prominent Cyber Security expert

Filed under: Artificial Intelligence,Computer Software — Suramya @ 11:39 PM

The more I read about the Microsoft implementation of ‘AI’ the more I wonder what on earth are they thinking? Their AI system is an absolute shambles and about 99% of the output is nonsense. See the example below:

Microsoft AI returns absolute nonsense when asked about who Kevin Beaumont is

I did not realise how inaccurate Microsoft’s AI is. It’s really bad. This is just one example – it lists a range of lawsuits I’ve filed, but they’re all fictional – it invented them and made up the citations. It says I gave Microsoft’s data to @briankrebs. It says Krebs is suing me. It says @malwaretech works for me. The list goes on and on. Very eyebrow raising this is being baked into next release of Windows 11 and Office. It will directly harm people who have no knowledge or recourse.

I mean, I can understand if it got one or two facts wrong because the data sources might not be correct, but to get every single detail wrong requires extra skill. The really scary part is that Google AI search is not much better, and both companies are in a race to replace their search engines with AI responses. Microsoft is going a step further and including it as a default option in Windows. I wonder how much of the user data stored on a Windows computer is being used to train these AI engines.

There needs to be an effort to create a search engine that filters out these AI generated responses and websites, so that we can go back to the old style search engines that actually returned useful & correct results.

– Suramya

August 31, 2023

Using LLM’s to change writing style to hide author?

Filed under: Artificial Intelligence,My Thoughts,Tech Related — Suramya @ 12:17 PM

It is fairly well known that folks can identify a writer based on their writing style. In fact there is a whole field of work called Stylometry that analyses writings to try to attribute authorship of documents/writings. This is used when new text/writings are found that are not attributed to any person, when agencies receive letters, or when articles are posted on extremist sites.

In 1964, Frederick Mosteller and David Wallace published a three-year study of the distribution of common words in the Federalist Papers and showed that the writing styles of Alexander Hamilton and James Madison differed in subtle ways. The researchers analyzed some disputed papers that both writers claimed to have written and, based on the respective writing styles of the two authors, concluded that the papers were likely to have come from Madison’s pen. Since then the technique has only improved: with AI and machine learning algorithms coming into play, researchers are able to identify the writing style of a post quickly and mostly accurately. Keep in mind that this is still more of an art than a science. However, it works well enough that people were able to identify the author of the book “The Cuckoo’s Calling”, published under the name Robert Galbraith, as none other than JK Rowling.
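To make the idea of comparing common-word usage concrete, here is a minimal sketch in Python. It is not Mosteller and Wallace's actual method (they used a far more careful statistical analysis); it only compares how often a handful of function words appear per 1,000 words and picks the nearest known author. The word list and the function names are assumptions made up for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative set of function words (an assumption for this sketch);
# real studies use much larger, carefully chosen lists.
FUNCTION_WORDS = ("the", "of", "to", "and", "in", "by", "on", "upon", "while", "whilst")


def function_word_rates(text):
    """Return occurrences per 1,000 words for each function word."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {w: 1000.0 * counts[w] / total for w in FUNCTION_WORDS}


def profile_distance(a, b):
    """Mean absolute difference between two rate profiles (lower = more similar)."""
    return sum(abs(a[w] - b[w]) for w in FUNCTION_WORDS) / len(FUNCTION_WORDS)


def closest_author(disputed, known_samples):
    """Pick the known author whose function-word profile is nearest to the disputed text."""
    target = function_word_rates(disputed)
    return min(
        known_samples,
        key=lambda name: profile_distance(target, function_word_rates(known_samples[name])),
    )
```

With only ten words and short samples the result is mostly noise, so treat this as a picture of the mechanism rather than something that would stand up to scrutiny.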

Now, as we all know “With every action there is an equal and opposite reaction” and there has been a lot of work put into what is called Adversarial stylometry. Which is the practice of consciously changing the writing style when writing so remove the identifying characteristics that can be used to figure out who wrote the text. Historically, this was achieved by paraphrasing the text and by having another person edit & modify the content.

With the recent advances in LLMs, especially systems like ChatGPT, it might be possible to use them to whitewash the text and remove enough identifying marks that the text becomes relatively more anonymous. I tried it out with the previous paragraph as the input and got the following as the result:

As commonly understood, the principle of “For every action, there is an opposing reaction” holds true. Considerable effort has been dedicated to a field known as Adversarial Stylometry. This involves intentionally altering one’s writing style to eliminate distinctive traits that could be employed to determine the authorship of a text. In the past, this was accomplished by rephrasing the content and enlisting the assistance of a third party to revise and adjust the material.

As you can see, the output is different enough in words, tone and phrasing that it would be hard to attribute correctly. Do it a few times in a loop and, after ensuring the original meaning is not lost, you can use it and be relatively sure that it will be difficult to identify the author of the text.
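The loop described above could look roughly like the sketch below. The `paraphrase` argument is a hypothetical callable that you would supply yourself (for example, a wrapper around whichever LLM you use); nothing here calls a real API. The word-overlap score is only a crude stand-in for "how different does it look" and says nothing about whether the meaning survived, so that check still has to be done by a human.

```python
import re


def word_set(text):
    """Lowercased set of the words used in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def word_overlap(a, b):
    """Jaccard overlap of vocabularies: 1.0 = same words, 0.0 = no words shared."""
    sa, sb = word_set(a), word_set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def rewrite_in_rounds(text, paraphrase, rounds=3):
    """Apply `paraphrase` repeatedly, keeping every intermediate version for review.

    `paraphrase` is a caller-supplied function (hypothetical here) that takes a
    string and returns a reworded string.
    """
    versions = [text]
    for _ in range(rounds):
        versions.append(paraphrase(versions[-1]))
    return versions
```

Keeping every intermediate version makes it easier to spot the round where the meaning started to drift and to fall back to the one before it.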

Obviously, if LLMs are going to be used to obfuscate text then other LLMs will be created that try to de-obfuscate the text, and the systems will keep getting smarter and smarter.

– Suramya

August 19, 2023

ChatGPT is not an all powerful AI it can’t even perform simple multiplications

Filed under: Artificial Intelligence — Suramya @ 9:49 PM

In my previous post I was talking about how I am usually quite skeptical of the claims being made about AI and how it is going to solve the world’s problems. There are some good use cases and there are some advantages to adding Machine Learning to the system. I have written about a few of them here.

But we are nowhere close to the Singularity, and most of the folks who are pushing AI right now are the same ones who were pushing NFTs earlier and Crypto before that… Case in point: according to the founder of OpenAI, AI is an existential threat and ChatGPT is a showcase of how advanced AI has become. However, this ‘advanced’ AI can’t even multiply two numbers to get the correct answer. I asked ‘What is 7467*5730’ and it gave me ‘42,777,810’. Then I asked my calculator to perform the same multiplication and that gave me: (Drum roll please…) 42,785,910. (See the screenshots below)


ChatGPT can’t perform a simple multiplication

Others have faced the same issue and I rest my case about why we need to be very careful using these AI assistants because they give answers that look legitimate and correct but are completely wrong or have subtle errors that need expertise to identify.
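The check itself is a one-liner in any language with exact integer arithmetic. The short Python sketch below uses the two figures quoted in this post; the function name is just something made up for illustration.

```python
def check_product(a, b, claimed):
    """Return the exact product and how far the claimed answer is from it."""
    actual = a * b
    return actual, claimed - actual


# 7467 * 5730, with the answer ChatGPT gave (42,777,810)
actual, error = check_product(7467, 5730, 42_777_810)
print(actual)  # 42785910
print(error)   # -8100
```

The answer is off by 8,100 on a product of about 42.8 million, which is small enough to look plausible at a glance and is exactly the kind of subtle error that is easy to miss.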

– Suramya

August 18, 2023

The Judge by GoblinTools is a great tool to tell how written text comes across qualitatively

Readers of this blog (and those who know me) know that I am usually quite skeptical of tools that claim to judge the emotional content of text or tell how the text comes across qualitatively. However, this new site called The Judge – Goblin Tools is quite impressive. As per James over on Mastodon, the tool was created to help neurodivergent folks communicate better with the normals and avoid misunderstandings, bad assumptions about intent, etc.

I tested it out with content from my blog and I have to say that it summarized the tone and content quite nicely. For example below is the summary it gave for my post on Celebrating India’s 76th Independence day!

Emotionally, this text comes across as proud, patriotic, and reflective. It begins by acknowledging and celebrating India’s 76th Independence Day, recognizing the significance of gaining freedom from British rule. It highlights the sacrifices, bloodshed, and tears endured by many individuals who fought for India’s freedom, emphasizing that not all their names are known and that this is a tragic aspect. The text expresses a sense of frustration with the tendency to focus solely on iconic figures like Gandhi and Nehru, disregarding the efforts of numerous other freedom fighters.

As the text looks towards the future, there is a strong emotional undertone of remembrance and gratitude towards those who paved the way for freedom. It cites famous slogans from freedom fighters, adding a sense of inspiration and reverence for their contribution. The passage acknowledges the importance of remembering the sacrifices made by past generations for the betterment of the nation.

There is a shift in emotion when it proudly proclaims India’s progress and achievements in various fields, such as technology, space exploration, and healthcare. It displays a sense of national pride and triumph over past perceptions that diminished India’s image. The text mentions India’s relief efforts during the COVID-19 pandemic and positions India as a nation capable of compassionate and humanitarian actions.

Moving forward, the text acknowledges areas that still require improvement, alluding to societal customs and mindsets. It expresses a desire for growth and change while also highlighting the need for self-reflection in other countries that criticize India, pointing out their own flaws and challenges.

The passage concludes with a message of unity and inclusivity, promoting the idea that humanity should transcend differences in caste, religion, nationality, sexuality, and age. It embraces the concept of a global family and emphasizes the values of humanity and national loyalty.

Overall, this text conveys a mixture of pride in India’s accomplishments, gratitude for the sacrifices made by freedom fighters, and a sense of reflection on the challenges that still exist. It aims to inspire unity, progress, and a commitment to the values that define the nation.

It did manage to capture the exact tone I was going for, so that is quite nice. According to their About Us page, the site uses AI technologies in the backend including OpenAI’s models. They are planning to move to open source alternatives as and when they become available and reliable. Keeping that in mind, I would not recommend using this to analyze any text that has private or confidential information because it might be used to train OpenAI’s models. That being said, I will be keeping an eye on this software, and once they move to an open source model and allow us to run a local instance of the tool I certainly wouldn’t mind using it as part of my writing toolkit.

– Suramya

June 28, 2023

Please stop shoving ChatGPT Integration into products that don’t need it

I am getting really tired of folks shoving ChatGPT integration into everything whether it makes sense or not. The latest silliness is an electric bike with ChatGPT integration. I understand the desire to integrate GPS/Maps etc in a bike, although personally I would rather use an independent device which would get updates more frequently than the built-in GPS, where the maps might get updated a few times a year (unless the maps are getting downloaded live using 3G/4G/whatever). I even understand the desire to integrate voice recognition in the setup so that the user can talk to it. But why on earth do I want/need to have ChatGPT shoved in there?

Based on ChatGPT’s well known tendency to hallucinate there is a good probability that it might decide that you should take a path that is not safe or even dump you into the ocean because it hallucinated that it was the way to go. This is the same thing we saw with Blockchain a few years ago, everything was suddenly on the Blockchain whether it needed to be or not. The sad part is that these folks are going to make a ton of money because of the hype behind ChatGPT and then bail leaving the consumers with a sub-par bike that hallucinates.

Source: Urtopia Unveils the World’s First Smart E-Bike with ChatGPT Integration at EUROBIKE 2023

– Suramya
