Joined 6 months ago
Cake day: March 3rd, 2024

  • language is intrinsically tied to culture, history, and group identity, so any concept that is expressed through a certain linguistic system is inseparable from its cultural roots

    i feel like this is a big part of it. it reminds me of the Sapir-Whorf hypothesis. search results and neural networks are susceptible to bias just like humans are; “garbage in, garbage out” as they say.

    the quote directly after mentions that newer or more precise searches produce more coherent results across languages. that reminds me of the time i got curious and looked up Marxism on Conservapedia. as you might expect, the high-level descriptions of Marxism are highly critical and heavily biased, but interestingly, once you dig down to concepts like historical materialism, it gets harder to spin, since popular media narratives largely ignore those details and any “spin” would likely be blatant falsehood.

    the author of the article seems to really want there to be a malicious conspiratorial effort to suppress information, and, while that may be true in some cases, it just doesn’t seem feasible at scale. this is good to call out, but i don’t think the people who dedicate their lives to the research and advancement of language concepts are sleeping on the fact that bias exists.


  • it’s super weird that people think LLMs are so fundamentally different from neural networks, the underlying technology. neural network architectures are constantly improving, and LLMs are just the product of a ton of research that followed the discovery of the transformer architecture. what LLMs have shown us is that we’re definitely on the right track using neural networks to solve a wide range of problems classified as “AI”.


  • yeah i see that too. it seems like mostly a reactionary viewpoint. the reaction is understandable to a point, since a lot of the “AI” features are half-baked and forced on the user. for that reason i don’t think GNOME etc. should be scrambling to add copies of these features.

    what i would love to see is more engagement around supplemental pieces of software. for example, i would love it if i could install a daemon that indexes my notes and lets me do semantic search. or something similar with my images.

    the problems with AI features aren’t within the tech itself but in the surrounding politics. it’s become commonplace for “responsible” AI companies like OpenAI to not even produce papers around their tech (product announcement blogs that are vaguely scientific don’t count), much less source code, weights, and details on training data. and even when Meta releases their weights, they don’t specify their datasets. the rat race to see who can make a decent product with this amazing tech has turned the whole industry into a bunch of pearl-clutching, FOMO-based tweakers.

    that sparks a comparison to blockchain, which is fair from the perspective of someone who hasn’t studied the tech or simply hasn’t seen a product that is relevant to them. but even those people will look at something fantastical like ChatGPT as if it’s pedestrian or unimpressive because, when they asked it to write an implementation of the HTTP spec in the style of Fetty Wap, it didn’t run perfectly the first time.
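
to make the notes daemon idea above a bit more concrete, here is a minimal sketch of just the index-and-rank part. big caveat: this is an assumption-heavy toy, not a real design. it uses plain term-frequency vectors and cosine similarity as a stand-in, so it only matches on shared words; actual semantic search would swap `vectorize` for embeddings from a local model. the names (`vectorize`, `build_index`, `search`) and the sample notes are all made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector.

    Stand-in for a real embedding model (assumption for this sketch).
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(notes):
    """Map each note name to its vector. A daemon would refresh this on file changes."""
    return {name: vectorize(body) for name, body in notes.items()}


def search(index, query, top_k=3):
    """Return the names of the top_k notes most similar to the query."""
    query_vec = vectorize(query)
    scored = [(cosine(query_vec, vec), name) for name, vec in index.items()]
    # Drop notes with no overlap, then sort by score (highest first), name as tiebreak.
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical notes, purely for illustration.
notes = {
    "groceries.md": "buy oat milk, coffee beans and rice",
    "transformers.md": "notes on attention and the transformer architecture",
    "backup.md": "rsync backup script for the home server",
}
index = build_index(notes)
```

calling `search(index, "transformer attention")` ranks `transformers.md` first because it shares the most words with the query. a query like "neural nets" would find nothing here, which is exactly the gap an embedding model would close.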


  • a lot of things are unknown.

    i’d be very surprised if it doesn’t have an opt out.

    a point i was trying to make is that a lot of this info already exists on their servers, and your trust in the privacy of that is what it is. if you don’t trust their claims that it runs on per-user virtualized compute, that it’s e2e encrypted, or that they’re using local models, i don’t know what to tell you. the model isn’t hoovering up your messages and sending them back to Apple unencrypted. it doesn’t need to for these features.

    all that said, this is just what they’ve told us, and there aren’t many people who know exactly what the implementation details are.

    the privacy issue with Recall, as i said, is that it collects a ton of data passively, without explicit consent. if i open my KeePass database on a Recall-enabled machine, i have little assurance that Recall doesn’t know my Gmail password. Apple’s bot uses existing data, in controlled systems. that’s the difference. sure, maybe people see Apple as more trustworthy, but maybe sociology has something to do with your reaction to it as well.