GioCities

blogs by Gio

Atom feed 🖱 cyber

🖱 Reddit: Your API *IS* Your Product

  • Posted in cyber

Reddit is going the same route as Twitter by making “API access” prohibitively expensive. This is something they very famously, very vocally said they would not do, but they’re doing it anyway. This is very bad for Reddit, but what’s worse is it’s becoming clear that companies think that this is a remotely reasonable thing to do, when it’s very critically not.

It’s the same problem we see with Twitter and other late-capitalist hell websites: Reddit’s product is the service it provides, which is its API. The ability for users to interact with the service isn’t an auxiliary premium extra, it’s the whole caboodle!

I’ll talk about first principles first, and then get into what’s been going on with Reddit and Apollo. The Apollo drama is very useful in that it directly converts the corporate bullshit that sounds technical enough to make sense into something very easy to understand: a corporation hurting them, today, for money.

The API is the product🔗

Reddit and all these other companies who are making user-level API access prohibitively expensive have forgotten that the API is the product. - The API is the interface that lets you perform operations on the site. The operations a user can do are the product, they’re not auxiliary to it!

“Application programming interface” is a very formal, internal-sounding term for a system that is none of those things. The word “programming” in the middle comes from an age where using a personal computer at all was considered “programming” it.

What an API really is a high-level interface to the web application that is Reddit. Every action a user can take — viewing posts, posting, voting, commenting — goes from the app (which interfaces with the user) to the API (which interfaces with the Reddit server), gets processed by the server using whatever-they-use-it-doesn’t-matter, and the response is sent back to the user.

The API isn’t a god mode and it doesn’t provide any super-powers. It doesn’t let you do anything you can’t do as a user, as clearly evidenced by the fact that all the actions you do on the Reddit website go through the API too.

The Reddit website, the official Reddit app, and the Apollo app all interface with the user in different ways and on different platforms, but go through the same API to interact with what we understand as “Reddit”. The fact that the API is the machine interface without the human interface should also concisely explain why “API access” is all Apollo needs to build its own app.

Right now, you can view the announcement thread at https://www.reddit.com/r/apolloapp/comments/144f6xm/apollo_will_close_down_on_june_30th_reddits/, and you can view the “API” data for the same thread at https://www.reddit.com/r/apolloapp/comments/144f6xm/apollo_will_close_down_on_june_30th_reddits.json. It’s not very fun to look at, but it’s easy to tell what you’re looking at: the fundamental representation of the page without all the trappings of the interface.

Public APIs are good for both the user and the company. They’re a vastly more efficient way for people to interact with the service than by automating interaction (or “scraping”). Having an API cuts out an entire layer of expense that, without an API, Reddit would pay for.

The Reddit service is the application, and you interface with it through WHATEVER. Whatever browser you want, whatever browser extensions you want, whatever model phone you want, whatever app you want. This is fundamentally necessary for operability and accessibility.

The API is the service. The mechanical ability to post and view and organize is what makes Reddit valuable, not its frontend. Their app actually takes the core service offering and makes it less attractive to users, which is why they were willing to pay money for an alternative!

🖱 So you want to write an AI art license

  • Posted in cyber

Hi, The EFF, Creative Commons, Wikimedia, World Leaders, and whoever else,

Do you want to write a license for machine vision models and AI-generated images, but you’re tired of listening to lawyers, legal scholars, intellectual property experts, media rightsholders, or even just people who use any of the tools in question even occasionally?

You need a real expert: me, a guy whose entire set of relevant qualifications is that he owns a domain name. Don’t worry, here’s how you do it:

This is an extremely condensed set of notes, designed as a high-level overview for thinking about the problem

Given our current system of how AI models are trained and how people can use them to generate new art, which is this:

sequenceDiagram
    Alice->>Model: Hello. Here are N images and<br>text descriptions of what they contain.
    Model->>Model: Training (looks at images, "makes notes", discards originals)
    Model->>Alice: OK. I can try to make similar images from my notes,<br>if you tell me what you want.
    Curio->>Model: Hello. I would like a depiction of this new <br>thing you've never seen before.
    Model->>Curio: OK. Here are some possibilites.

The works🔗

The model and the works produced with the model are both distinct products. The model is more like processing software or tooling, while the artistic works created with the model are distinctly artistic/creative output.

Models do not keep the original images they were trained on in any capacity. The only keep mathematical notes about their properties. You (almost always) cannot retrieve the original image data used from the model after training.

sequenceDiagram
    Curio->>Model: Send me a copy of one of the images you were trained on
    Model->>Curio: Sorry, I do not remember any of them exactly,<br>only general ideas on how to make art.

There is a lot of misinformation about this, but it is simply, literally the case that a model does not include the training material, and cannot reproduce its training material. While not trivial (you can’t have a model if you can’t train it at all), when done properly, the specific training data is effectively incidental.

AI-generated art should be considered new craftsmanship — specifically, under copyright law, it is new creative output with its own protections — and not just a trivial product of its inputs.

Plagiarism🔗

The fact that AI art is new creative output doesn’t mean AI art can’t be plagiarism.

Just like with traditional art, it’s completely possible for specific products to be produced to be copies, but that doesn’t make that the case for all works in the medium. You can trace someone else’s artwork, but that doesn’t make all sketches automatically meritless works.

The inner workings of tools used in the creation of an artistic work are not what determines if a given product is plagiarism, or if it infringes on a copyright. Understanding the workings of the tool can be used in determining if a work is an infringement, but it is not the deciding factor.

🖱 Replika: Your Money or Your Wife

  • Posted in cyber

If1 you’ve been subjected to advertisements on the internet sometime in the past year, you might have seen advertisements for the app Replika. It’s a chatbot app, but personalized, and designed to be a friend that you form a relationship with.

That’s not why you’d remember the advertisements though. You’d remember the advertisements because they were like this:

Replika "Create your own AI friend" "I've been missing you" hero ad

Replika ERP ad, Facebook (puzzle piece meme) Replika ERP ad, Instagram

And, despite these being mobile app ads (and, frankly, really poorly-constructed ones at that) the ERP function was a runaway success. According to founder Eugenia Kuyda the majority of Replika subscribers had a romantic relationship with their “rep”, and accounts point to those relationships getting as explicit as their participants wanted to go:

erp1

So it’s probably not a stretch of the imagination to think this whole product was a ticking time bomb. And — on Valentine’s day, no less — that bomb went off. Not in the form of a rape or a suicide or a manifesto pointing to Replika, but in a form much more dangerous: a quiet change in corporate policy.

Features started quietly breaking as early as January, and the whispers sounded bad for ERP, but the final nail in the coffin was the official statement from founder Eugenia Kuyda:

“update” - Kuyda, Feb 12 These filters are here to stay and are necessary to ensure that Replika remains a safe and secure platform for everyone.

I started Replika with a mission to create a friend for everyone, a 24/7 companion that is non-judgmental and helps people feel better. I believe that this can only be achieved by prioritizing safety and creating a secure user experience, and it’s impossible to do so while also allowing access to unfiltered models.

People just had their girlfriends killed off by policy. Things got real bad. The Replika community exploded in rage and disappointment, and for weeks the pinned post on the Replika subreddit was a collection of mental health resources including a suicide hotline.

Resources if you're struggling post

Cringe!🔗

First, let me deal with the elephant in the room: no longer being able to sext a chatbot sounds like an incredibly trivial thing to be upset about, and might even be a step in the right direction. But these factors are actually what make this story so dangerous.

These unserious, “trivial” scenarios are where new dangers edge in first. Destructive policy is never just implemented in serious situations that disadvantage relatable people first, it’s always normalized by starting with edge cases and people who can be framed as Other, or somehow deviant.

It’s easy to mock the customers who were hurt here. What kind of loser develops an emotional dependency on an erotic chatbot? First, having read accounts, it turns out the answer to that question is everyone. But this is a product that’s targeted at and specifically addresses the needs of people who are lonely and thus specifically emotionally vulnerable, which should make it worse to inflict suffering on them and endanger their mental health, not somehow funny. Nothing I have to content-warning the way I did this post is funny.

Virtual pets🔗

So how do we actually categorize what a replika is, given what a novel thing it is? What is a personalized companion AI? I argue they’re pets.

🖱 Lies, Damned Lies, and Subscriptions

  • Posted in cyber

Everybody hates paying subscription fees. At this point most of us have figured out that recurring fees are miserable. Worse, they usually seem unfair and exploitative. We’re right about that much, but it’s worth sitting down and thinking through the details, because understanding the exceptions teaches us what the problem really is. And it isn’t just “paying people money means less money for me”; the problem is fundamental to what “payment” even is, and vitally important to understand.

Human Agency: Why Property is Good🔗

or, “Gio is not a marxist, or if he is he’s a very bad one”

First: individual autonomy — our agency, our independence, and our right to make our own choices about our own lives — is threatened by the current digital ecosystem. Our tools are powered by software, controlled by software, and inseparable from their software, and so the companies that control that software have a degree of control over us proportional to how much of our lives relies on software. That’s an ever-increasing share.

🖱 The Failure of Account Verification

  • Posted in cyber

The “blue check” — a silly colloquialism for an icon that’s not actually blue for the at least 50% of users using dark mode — has become a core aspect of the Twitter experience. It’s caught on other places too; YouTube and Twitch have both borrowed elements from it. It seems like it should be simple. It’s a binary badge; some users have it and others don’t. And the users who have it are designated as… something.

In reality it’s massively confused. The first problem is that “something”: it’s fundamentally unclear what the significance of verification is. What does it mean? What are the criteria for getting it? It’s totally opaque who actually makes the decision and what that process looks like. And what does “the algorithm” think about it; what effects does it actually have on your account’s discoverability?

This mess is due to a number of fundamental issues, but the biggest one is Twitter’s overloading the symbol with many conflicting meanings, resulting in a complete failure to convey anything useful.

xkcd twitter_verification

History of twitter verification🔗

Twitter first introduced verification in 2009, when baseball man Tony La Russa sued Twitter for letting someone set up a parody account using his name. It was a frivolous lawsuit by a frivolous man who has since decided he’s happy using Twitter to market himself, but Twitter used the attention to announce their own approach to combating impersonation on Twitter: Verified accounts.

🖱 You can Google it

  • Posted in cyber

The other day I had a quick medical question (“if I don’t rinse my mouth out enough at night will I die”), so I googled the topic as I was going to bed. Google showed a couple search results, but it also showed Answers in a little dedicated capsule. This was right on the heels of the Yahoo Answers shutdown, so I poked around to see what Google’s answers were like. And those… went in an unexpected direction.

Should I rince my mouth after using mouthwash? Why is it bad to swallow blood? Can a fly live in your body? What do vampires hate? Can you become a vampire? How do you kill a vampire?

So, Google went down a little rabbit trail. Obviously these answers were scraped from the web, and included sources like exemplore.com/paranormal/ which is, apparently, a Wiccan resource for information that is “astrological, metaphysical, or paranormal in nature.” So possibly not the best place to go for medical advice. (If you missed it, the context clue for that one was the guide on vampire killing.)

There are lots of funny little stories like this where some AI misunderstood a question. Like this case where a porn parody got mixed in the bio for a fictional character, or that time novelist John Boyne used Google and accidently wrote a video recipe into his book. (And yes, it was a Google snippet.) These are always good for a laugh.

Wait, what’s that? That last one wasn’t funny, you say? Did we just run face-first toward the cold brick wall of reality, where bad information means people die?

Well, sorry. Because it’s not the first time Google gave out fatal advice, nor the last. Nor is there any end in sight. Whoops!

🖱 Client CSAM scanning: a disaster already

  • Posted in cyber

Update 2023: I won.

On August 5, 2021, Apple presented their grand new Child Safety plan. They promised “expanded protections for children” by way of a new system of global phone surveillance, where every iPhone would constantly scan all your photos and sometimes forward them to local law enforcement if it identifies one as containing contraband. Yes, really.

August 5 was a Thursday. This wasn’t dumped on a Friday night in order to avoid scrutiny, this was published with fanfare. Apple really thought they had a great idea here and expected to be applauded for it. They really, really didn’t. There are almost too many reasons this is a terrible idea to count. But people still try things like this, so as much as I wish it were, my work is not done. God has cursed me for my hubris, et cetera. Let’s go all the way through this, yet again.

The architectural problem this is trying to solve🔗

Believe it or not, Apple actually does address a real architectural issue here. Half-heartedly addressing one architectural problem of many doesn’t mean your product is good, or even remotely okay, but they do at least do it. Apple published a 14 page summary of the problem model (starting on page 5). It’s a good read if you’re interested in that kind of thing, but I’ll summarize it here.

🖱 Ethical Source is a Crock of Hot Garbage

  • Posted in cyber

There’s this popular description of someone “having brain worms”. It invokes the idea of having your mind so thoroughly infested with an idea to the point of disease. As with the host of an infestation, such a mind is poor-to-worthless at any activity other than sustaining and spreading the parasite.

A “persistent delusion or obsession”. You know, like when you think in terms of legality so much you can’t even make ethical evaluations anymore, or when you like cops so much you stop being able to think about statistics, or the silicon valley startup people who try to solve social problems with bad technology, or the bitcoin people who responded to the crisis in Afghanistan by saying they should just adopt bitcoin. “Bad, dumb things”. You get the idea.

And, well.

Okay, so let’s back way up here, because this is just the tip of the iceberg of a story that needs years of context. I’ll start with the most recent event here, the Mastodon tweet.

The Mastodon Context🔗

The “he” Mastodon is referring to is ex-president-turned-insurrectionist Donald Trump, who, because his fellow-insurrectionist friends and fans are subject to basic moderation policies on most of the internet, decided to start his own social network, “Truth Social”. In contrast to platforms moderated by the “tyranny of big tech”, Truth Social would have principles of Free Speech, like “don’t read the site”, “don’t link to the site”, “don’t criticise the site”, “don’t use all-caps”, and “don’t disparage the site or us”. There are a lot of problems here already, but because everything Trump does is terrible and nobody who likes him can create anything worthwhile, instead of actually making a social networking platform, they just stole Mastodon wholesale.

Mastodon is an open-source alternative social networking platform. It’s licensed under an open license (the AGPLv3), so you are allowed to clone it and even rebrand it for your own purposes as was done here. What you absolutely are not allowed to do is claim the codebase is your own proprietary work, deliberately obscure the changes you made to the codebase, or make any part of the AGPL-licensed codebase (including your modifications) unavailable to the public. All of which Truth Social does.

So that’s the scandal. And so here’s Mastodon poking some fun at that.

🖱 Is (git) master a dirty word?

  • Posted in cyber

Git is changing. GitHub, GitLab, and the core git team have a made a system of changes to phase out the use of the word “master” in the development tool, after a few years of heated (heated) discussion. Proponents of the change argue “slavery is bad”, while opponents inevitably end up complaining about the question itself being “overly political”. Mostly. And, with the tendency of people in the computer science demographic to… let’s call it “conservatism”, this is an issue that gets very heated, very quickly. I have… thoughts on this, in both directions.

Formal concerns about problematic terminology in computing (master, slave, blacklist) go back as early as 2003, at the latest; this is not a new conversation. The push for this in git specifically started circa 2020. There was a long thread on the git mailing list that went back and forth for several months with no clear resolution. It cited Python’s choice to move away from master/slave terminology, which was formally decided on as a principle in 2018. In June of 2020, the Software Freedom Conservancy issued an open letter decrying the term “master” as “offensive to some people.” In July 2020 github began constructing guidance to change the default branch name and in 2021 GitLab announced it would do the same.


First, what role did master/slave terminology have in git, anyway? Also, real quick, what’s git? Put very simply, git is change tracking software. Repositories are folders of stuff, and branches are versions of those folders. If you want to make a change, you copy the file, modify it, and slot it back in. Git helps you do that and also does some witchery to allow multiple people to make changes at the same time without breaking things, but that’s not super relevant here.

That master version that changes are based is called the master branch, and is just a branch named master. Changes are made on new branches (that start as copies of the master branch) which can be named anything. When the change is final, it’s merged back into the master branch. Branches are often deleted after they’re merged.

🖱 YouTube broke links and other life lessons

  • Posted in cyber

This morning YouTube sent out an announcement that, in one month, they’re going to break all the links to all unlisted videos posted prior to 2017. This is a bad thing. There’s a whole lot bad here, actually.

Edit: Looks like Google is applying similar changes to Google Drive, too, meaning this doesn’t just apply to videos, but to any publicly shared file link using Google Drive. As of next month, every public Google Drive link will stop working unless the files are individually exempted from the new security updates, meaning any unmaintained public files will become permanently inaccessible. Everything in this article still applies, the situation is just much worse than I thought.

The Basics🔗

YouTube has three kinds of videos: Public, Unlisted, and Private. Public videos are the standard videos that show up in searches. Private videos are protected, and can only be seen by specific YouTube accounts you explicitly invite. Unlisted videos are simply unlisted: anyone with the link can view, but the video doesn’t turn up automatically in search results.

Unlisted videos are obviously great, for a lot of reasons. You can just upload videos to YouTube and share them with relevant communities — embed them on your pages, maybe — without worrying about all the baggage of YouTube as a Platform.

What Google is trying to do here is roll out improvements they made to the unlisted URL generation system to make it harder for bots and scrapers to index videos people meant to be semi-private. This is a good thing. The way they’re doing it breaks every link to the vast majority of unlisted videos, including shared links and webpage embeds. This is a tremendously bad thing. I am not the first to notice this.

See, I just kind of sighed when I saw this, because this isn’t the first time I’ve lived through it. On March 15, 2017, Dropbox killed their public folder. Prior to that, Dropbox had a service where you could upload files to a special “Public” folder. This let you easily share links to those files with anyone — or groups of people — without having to explicitly invite them by email, and make them register a Dropbox account. Sound familiar?

🖱 Twitter Blue is a late-stage symptom

  • Posted in cyber

Twitter Blue! $5/mo for Premium Twitter. It’s the latest thing that simply everyone.

News articles about twitter blue

I have an issue with it, but over a very fundamental point, and one Twitter shares with a lot of other platforms. So here’s why it’s bad that Twitter decided to put accessibility features behind a paywall, and it isn’t the obvious.

Client/Server architecture in 5 seconds🔗

All web services, Twitter included, aren’t just one big magic thing. You can model how web apps work as two broad categories: the client and the server. The client handles all your input and output: posts you make, posts you see, things you can do. The server handles most of the real logic: what information gets sent to the client, how posts are stored, who is allowed to log in as what accounts, etc.

🖱 How Apple Destroyed Mobile Freeware

  • Posted in cyber

I have a memory from when I was very young of my dad doing the finances. He would sit in his office with a computer on one side and an old-fashioned adding machine on the desk. While he worked on the spreadsheet on the computer, he would use the adding machine for quick calculations.

Adding machine

A year or two ago I had a very similar experience. I walked upstairs to the office and there he was, at the same desk, spreadsheet on one side and calculator on the other. Except it was 2020, and he had long ago replaced the adding machine with an iPad. There was really one noticeable difference between the iPad and the old adding machine: the iPad was awful at the job. My dad was using some random calculator app that was an awkwardly scaled iPhone app with an ugly flashing banner add at the bottom.