Hacker News

> We’re releasing Llama 3.1 405B, the first frontier-level open source AI model, as well as new and improved Llama 3.1 70B and 8B models.

Bravo! While I don't agree with Zuck's views and actions on many fronts, on this occasion I think he and the AI folks at Meta deserve our praise and gratitude. With this release, they have brought the cost of pretraining a frontier 400B+ parameter model to ZERO for pretty much everyone -- well, everyone except Meta's key competitors.[a] THANK YOU ZUCK.

Meanwhile, the business-minded people at Meta surely won't mind if the release of these frontier models to the public happens to completely mess up the AI plans of competitors like OpenAI/Microsoft, Google, Anthropic, etc. Come to think of it, the negative impact on such competitors was likely a key motivation for releasing the new models.

---

[a] The license is not open to the handful of companies worldwide which have more than 700M users.



Look, absolutely zero people in the world should trust any tech company when they say they care about or will keep commitments to the open-source ecosystem in any capacity. Nevertheless, it is occasionally strategic for them to do so, and there can be ancillary benefits for said ecosystem in those moments where this is the best play for them to harm their competitors

For now, Meta seems to release Llama models in ways that don't significantly lock people into their infrastructure. If that ever stops being the case, you should fork rather than trust their judgment. I say this knowing full well that most of the internet is on AWS or GCP, most brick and mortar businesses use Windows, and carrying a proprietary smartphone is essentially required to participate in many aspects of the modern economy. All of this is a mistake. You can't resist all lock-in. The players involved effectively run the world. You should still try where you can, and we should still be happy when tech companies either slip up or make the momentary strategic decision to make this easier


> If that ever stops being the case, you should fork rather than trust their judgment.

Fork what? The secret sauce is in the training data and infrastructure. I don't think either of those is currently open.


I'm just a lowly outsider to the AI space, but calling these open source models seems kind of like calling a compiled binary open source.

If you don't have a way to replicate what they did to create the model, it seems more like freeware than open source.


As an ML researcher, I agree. Meta doesn't include adequate information to replicate the models, and from the perspective of fundamental research, the interest that big tech companies have taken in this field has been a significant impediment to independent researchers precisely because of this fundamental lack of openness, even though they are undeniably producing groundbreaking results in many respects.

This should also make everyone very skeptical of any claim they are making, from benchmark results to the legalities involved in their training process to the prospect of future progress on these models. Without being able to vet their results against the same datasets they're using, there is no way to verify what they're saying, and the credulity that otherwise smart people have been exhibiting in this space has been baffling to me

As a developer, if you have a working Llama model, including the source code and weights, and it's crucial for something you're building or have already built, it's still fundamentally a good thing that Meta isn't gating it behind an API. If they went away tomorrow, you could still use, self-host, retrain, and study the models.


Which option would be better?

A) Release the data, and if it ends up causing a privacy scandal, at least you can actually call it open this time.

B) Neuter the dataset, and the model

All I ever see in these threads is a lot of whining and no viable alternative solutions (I’m fine with the idea of it being a hard problem, but when I see this attitude from “researchers” it makes me less optimistic about the future)

> and the credulity that otherwise smart people have been exhibiting in this space has been baffling to me

Remove the “otherwise” and you’re halfway to understanding your error.


This isn't a dilemma at all. If Facebook can't release the data it trains on because doing so would compromise user privacy, that is already a significant privacy violation that should be a scandal. And if releasing the data would prompt regulatory or legislative remedies against Facebook, then releasing the trained model, even through an API, should do the same. The only reason people don't think about it this way is that public awareness of how these technologies work isn't pervasive enough for the general public to think it through, and it's hard to prove definitively. Basically, if this is Facebook's position, it's saying that the release of the model already constitutes a violation of user privacy, and they're betting no one will catch them.

If the company wants to help research, it should full-throatedly endorse the position that it doesn't consider it a violation of privacy to train on the data it does, and release it so that it can be useful for research. If the company thinks it's safeguarding user privacy, it shouldn't be training models on data it considers private and then using them in public-facing ways at all

As it stands, Facebook seems to take the position that it wants to help the development of software built on models like Llama, but not really the fundamental research that goes into building those models in the same way


> If Facebook can't release data it trains on because it would compromise user privacy, it is already a significant privacy violation that should be a scandal

Thousands of entities would scramble to sue Facebook over any released dataset no matter what the privacy implications of the dataset are.

It's just not worth it in any world. I believe you are not thinking of this problem from the view of the PM or VPs that would actually have to approve this: if I were a VP and I was 99% confident that the dataset had no privacy implications, I still wouldn't release it. Just not worth the inevitable long, drawn out lawsuits from people and regulators trying to get their pound of flesh.

I feel the world is too hostile to big tech and AI to enable something like this. So, unless we want to kill AGI development in the cradle, this is what we get - and we can thank modern populist techno-pessimism for cultivating this environment.


Translation: "we train our data on private user data and copyrighted material so of course we cannot disclose any of our datasets or we'll be sued into oblivion"

There's no AGI development in the cradle. And the world isn't "hostile". The world is increasingly tired of predatory behavior by supranational corporations


> I feel the world is too hostile to big tech

Lmao what? If the world were sane and hostile to big tech, we would've nuked them all years ago for all the bullshit they pulled and continue to pull. Big tech has politicians in their pockets, but thankfully the "populist techno-pessimist" (read: normal people who are sick of billionaires exploiting the entire planet) are finally starting to turn their opinions, albeit slowly.

If we lived in a sane world Cambridge Analytica would've been the death knell of Facebook and all of the people involved with it. But we instead live in a world where psychopathic pieces of shit like Zucc get away with it, because they can just buy off any politician who knocks on their doors.


> normal people who are sick of billionaires exploiting the entire planet

They don't understand what big tech does for humanity or how much they rely on it day to day. Literally all of their modern conveniences are enabled by big tech.


Rather dismissive, particularly as CrowdStrike has laid a good chunk of that bare.

In my experience, many 'normal people' understand far more than you give them credit for, and many are able to forgo modern 'conveniences' if pressed.


CrowdStrike merely shows how much people depend on big tech without even realizing it.

I think you have too much faith in the average person. They scarcely understand how nearly everything in their life has been manufactured on or designed on something powered by big tech.


This post demonstrates a willful ignorance of the factors driving so-called "populist techno-pessimism" and I'm sure every time a member of the public is exposed to someone talking like this, their "techno-pessimism" is galvanized

The ire people have toward tech companies right now is, like most ire, perhaps in places overreaching. But it is mostly justified by the real actions of tech companies, and facebook has done more to deserve it than most. The thought process you just described sounds like an accurate prediction of the mindset and culture of a VP within Facebook, and I'd like you to reflect on it for a sec. Basically, you rightly point out that the org releasing what data they have would likely invite lawsuits, and then you proceeded to do some kind of insane offscreen mental gymnastics that allow this reality to mean nothing to you but that the unwashed masses irrationally hate the company for some unknowable reason

Like you're talking about a company that has spent the last decade buying competitors to maintain an insane amount of control over billions of users' access to their friends, feeding them an increasingly degraded and invasive channel of information that also from time to time runs nonconsensual social experiments on them, and following even people who didn't opt in around the internet through shady analytics plugins in order to sell dossiers of information on them to whoever will pay. What do you think it is? Are people just jealous of their success, or might they have some legit grievances that may cause them to distrust and maybe even loathe such an entity? It is hard for me to believe Facebook has a dataset large enough to train a current-gen LLM that wouldn't also feel, viscerally, to many, like a privacy violation. Whether any party that felt this way could actually win a lawsuit is questionable though, as the US doesn't really have significant privacy laws, and this is partially due to extensive collaboration with, and lobbying by, Facebook and other tech companies who do mass surveillance of this kind.

I remember a movie called Das Leben der Anderen (2006) (Officially translated as "the lives of others") which got accolades for how it could make people who hadn't experienced it feel how unsettling the surveillance state of East Germany was, and now your average American is more comprehensively surveilled than the Stasi could have imagined, and this is in large part due to companies like facebook

Frankly, I'm not an AGI doomer, but if the capabilities of near-future AI systems are even in the vague ballpark of the (fairly unfounded) claims the American tech monopolies make about them, it would be an unprecedented disaster on a global scale if those companies got there first. So inasmuch as we view "AGI research" as something that's inevitably going to hit milestones in corporate labs with secretive datasets, I think we should absolutely kill it to whatever degree is possible, and that's as someone who truly, deeply believes that AI research has been beneficial to humanity and could continue to become more so.


> Release the data, and if it ends up causing a privacy scandal...

We can't prove that a model like Llama will never produce a segment of its training data verbatim.

Any potential privacy scandal is already in motion.

My cynical assumption is that Meta knows that competitors like OpenAI have PR-bombs in their trained model and therefore would never opensource the weights.


The model is public, so you can at least verify their benchmark claims.


Generally speaking, no. An important part of a lot of benchmarks in ML research is generalization. It's often a lot easier to get a machine learning model to memorize the test cases in a benchmark than it is to train it to perform the general capability the benchmark is trying to test for. For that reason, the dataset matters: if it includes the benchmark test cases in some form, it invalidates the test.

When AI research was still mostly academic, I'm sure a lot of people still cheated, but there was somewhat less incentive to, and norms like publishing datasets made it easier to verify claims made in research papers. In a world where people don't, and there's significant financial incentive to lie, I just kind of assume they're lying
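As a toy illustration of that contamination worry, a crude first-pass audit looks for verbatim word n-gram overlap between the training corpus and benchmark items (real audits use normalization and fuzzy matching; this sketch and its threshold are made up for illustration):

```python
# Toy contamination check: flag a benchmark item if any word n-gram
# from it appears verbatim in the training corpus.
def ngrams(text, n):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(train_text, test_item, n=8):
    # Any shared n-gram suggests the test item leaked into training data.
    return bool(ngrams(train_text, n) & ngrams(test_item, n))
```

Without the training corpus, nobody outside the lab can run even this crude check against a model's reported benchmark numbers.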


> If you don't have a way to replicate what they did to create the model, it seems more like freeware

Isn't that a bit like arguing that a linux kernel driver isn't open source if I just give you a bunch of GPL-licensed source code that speaks to my device, but no documentation how my device works? If you take away the source code you have no way to recreate it. But so far that never caused anyone to call the code not open-source. The closest is the whole GPL3 Tivoization debate and that was very divisive.

The heart of the issue is that open source is kind of hard to define for anything that isn't software. As a proxy we could look at Stallman's free software definition. Free software shares a common history with open source, and most open source software is free/libre and vice versa, so this might be a useful proxy.

So checking the four software freedoms:

- The freedom to run the program as you wish, for any purpose: For most purposes. There's that 700M user restriction, also Meta forbids breaking the law and requires you to follow their acceptable use policy.

- The freedom to study how the program works, and change it so it does your computing as you wish: yes. You can change it by fine tuning it, and the weights allow you to figure out how it works. At least as well as anyone knows how any large neural network works, but it's not like Meta is keeping something from you here

- The freedom to redistribute copies so you can help your neighbor: Allowed, no real asterisks

- The freedom to distribute copies of your modified versions to others: Yes

So is it Free Software™? Not really, but it is pretty close.


The model is "open-source" for the purpose of software engineering, and it's "closed data" for the purpose of AI research. These are separate issues and it's not necessary to conflate them under one term


> it seems more like freeware than open source.

What would you have them do instead? Specifically?


Release the training set and the code that was used to train the model, or stop calling it open source.

If you can't fork it and take the project in your own direction, it's not open source.


They actually did open source the infrastructure library they developed. They don't open source the data but they describe how they gathered/filtered it.


A good point.

Forgive me, I am AI-naive: is there some way to harness Llama to train one's own actually-open AI?


Kinda. Since you can self-host the model on a Linux machine, there's no meaningful way for them to prevent you from having the trained weights. You can use this to bootstrap other models, retrain it on your own datasets, or fine-tune it from the starting point of the currently-working model. What you can't do is be sure what they trained it on.


How open is it really though? If you're starting from their weights, do you actually have legal permission to use derived models for commercial purposes? If it turns out that Meta used datasets they didn't have licenses to use in order to generate the model, then you might be in a big heap of mess.


From a legal perspective, yeah. If we end up having any legal protection against training AI models, legal liability will be a huge mess for everyone involved. From an engineering perspective, if all you need is the pretrained weights, there's no clear way Facebook could show up and break your product, as compared to relying on, say, an OpenAI API key rather than a self-hosted Llama instance.


I could be wrong but most “model” licenses prohibit the use of the models to improve other models


That's a good point. I expect it is ultimately unenforceable though. I'm describing training a model for myself, not for sale or public consumption.


Is forking really possible with an LLM, or one the size of future Llama versions? Have they even released the weights and everything? Maybe I am just negative about it because I feel Meta is the worst company ever invented, and that this will hurt society in the long run just like Facebook.


> have they even released the weights?

Isn't that what the model is? Just a collection of weights?


When you run `ollama pull llama3.1:70b`, which you can literally do right now (assuming you've installed ollama from ollama.com and you're not afraid of the terminal), and it downloads a 40 gigabyte model, that is the weights!

I'd consider the ability to admit that even your most hated adversary is doing something right a hallmark of acting smarter.

Now, they haven't released the training data with the model weights. THAT plus the training tooling would be "end to end open source". Apple actually did that very thing recently, and it flew under almost everyone's radar for some reason:

https://x.com/vaishaal/status/1813956553042711006?s=46&t=qWa...


Doing something right vs doing something that seems right but has a hidden self interest that is harmful in the long run can be vastly different things. Often this kind of strategy will allow people to let their guard down, and those same people will get steamrolled down the road, left wondering where it all went wrong. Get smarter.


How in the heck is an open source model that is free and open today going to lock me down, down the line? This is nonsense. You can literally run this model forever if you use NixOS (or never touch your windows, macos or linux install again). Zuck can't come back and molest it. Ever.

The best I can tell is that their self-interest here is more about gathering mindshare. That's not a terrible motive; in fact, that's a pretty decent one. It's not the bully pressing you into their ecosystem with a tit-for-tat; it's the nerd showing off his latest and going "Here. Try it. Join me. Join us."


> How in the heck is an open source model that is free and open today

It's free, but it's not open source.


Yeah because history isn't absolutely littered with examples of shiny things being dangled in front of people with the intent to entrap them /s.

Can you really say this model will still be useful in 2 years, 5 years for you? And that FB's stance on these models will still be open source at that time once they incrementally make improvements? Maybe, maybe not. But FB doesn't give anything away for free, and the fact that you think so is your blindness, not mine. In case you haven't figured it out, this isn't a technology problem, this is a "FB needs marketshare and it needs it fast" problem.


> But FB doesn't give anything away for free, and the fact that you think so is your blindness, not mine

Is it, though? They are literally giving this away "for free". https://dev.to/llm_explorer/llama3-license-explained-2915 Unless you build a service with it that has over 700 million monthly users (read: "problem anyone would love to have"), you do not have to re-negotiate a license agreement with them. Beyond that, it can't "phone home" or do any other sorts of nefarious shite. The other limitations there, which you can plainly read, seem not very restrictive.

Is there a magic secret clause conspiracy buried within the license agreement that you believe will be magically pulled out at the worst possible moment? >..<

Sometimes, good things happen. Sorry you're "too blinded" by past hurt experience to see that, I guess


In tech you can trust the underdogs. Once they turn into dominant players they turn evil. 99% of the cases.


Praising is good. Gratitude is a bit much. They got this big by selling user generated content and private info to the highest bidder. Often through questionable means.

Also, the underdog always touts Open Source and standards, so it’s good to remain skeptical when/if tables turn.


All said and done, it is a very expensive and ballsy way to undercut competitors. They've spent more than $5B on hardware alone, much of which will depreciate in value quickly.

Pretty sure the only reason Meta’s managed to do this is because of Zuck’s iron grip on the board (majority voting rights). This is great for Open Source and regular people though!


Zuck made a bet when they provisioned for reels to buy enough GPUs to be able to spin up another reels-sized service.

Llama is probably just running on spare capacity (I mean, sure, they've kept increasing capex, but if they're worried about an llm-based fb competitor they sort of have to in order to enact their copycat strategy)


At Meta's scale, spending $5B to stay competitive is not ballsy. It's a bargain.


Well, he didn't do it to be "nice", you can be sure about that. Obviously they see a financial gain somewhere/sometime


I'm perfectly happy with them draining the life essence out of the people crazy enough to still use Facebook, if they're funneling the profits into advancing human progress with AI. It's an Alfred Nobel kind of thing to do.


It's not often you see a take this bad on HN. Wow!

You are aware Facebook tracks everyone, not just people with Facebook accounts, right? They have a history of being anti-consumer in every sense of the word. So while I can understand where you're coming from, it's just not anywhere close to being reality.

Whether you want to or not, whether you consent or not, Facebook is tracking and selling you.


Oh no! Facebook knows who I am!

No they are not selling me. How can they sell my attention to advertisers when I don't look at their ads? How can they influence me if I don't engage with their algorithm? You're the one who's trying to sell me your fear and mistrust.


>selling user generated content and private info to the highest bidder

Was always their modus operandi, surely. How else would they have survived?

Thanks for returning everyone else's content, and never mind all the content stealing your platform did.


> the AI folks at Meta deserve our praise and gratitude

We interviewed Thomas, who led Llama 2 and 3 post-training, in case you want to hear from someone closer to the ground on the models: https://www.latent.space/p/llama-3


"Come to think of it, the negative impact on such competitors was likely a key motivation for releasing the new models."

"Commoditize Your Complement" is often cited here: https://gwern.net/complement


Makes me wonder why he's really doing this. Zuckerberg being Zuckerberg, it can't be out of any genuine sense of altruism. Probably just wants to crush all competitors before he monetizes the next generation of Meta AI.


It's certainly not altruism. Given that Facebook/Meta owns the largest user data collection systems, any advancement in AI ultimately strengthens their business model (which is still mostly collecting private user data, amassing large user datasets, and selling targeted ads).

There is a demo video that shows a user wearing a Quest VR headset who asks the AI "what do you see", and it interprets everything around it. Then, "what goes well with these shorts"... You can see where this is going. Wearing headsets with AIs monitoring everything the users see and collecting even more data is becoming normalized. Imagine the private data harvesting capabilities of the internet, but anywhere in the physical world. People need not even choose to wear a Meta headset; simply passing a user with a Meta headset in public will be enough to have private data collected. This will be the inevitable result of vision-model improvements integrated into mobile VR/AR headsets.


That's very dystopian. It's bad enough having cameras everywhere now. I never opted in to being recorded.


That sounds fantastic. If they make the Meta headset easy to wear and somewhat fashionable (closer to eyeglass than to a motorcycle helmet), I'd take it everywhere and record everything. Give me a retrospective search and conferences/meetings will be so much easier (I am terrible with names).


I wouldn’t even say hi, let alone my name, to someone wearing a Meta headset out in public. And if facial recognition becomes that common for wearers, most of the population is going to adorn something to prevent that. And if it’s at work, I’m not working there, and I have to think many would agree. Coworkers don’t and wouldn’t tolerate coworkers taking videos or pictures of them.


This is not how the overwhelming majority of the world works though.

> if facial recognition becomes that common for wearers, most of the population is going to adorn something to prevent that

"Most of the population" is going to be "the wearers".

> Coworkers don’t and wouldn’t tolerate coworkers taking videos or pictures of them.

Here is a fun experience you can try: just hit "record" on every single Teams or Meet meeting you're ever on (or just set recording as the default setting in the app).

See how many coworkers comment on it, let alone protest.

I can tell you from experience (of having been in thousands of hours of recorded meetings in the last 3 years) that the answer is zero.


You are probably right, but that is truly a cyberpunk dystopian situation. A few megacorps will catalog every human interaction and there will be no way to opt out.


Of course, no Hacker News thread is complete without the "I would never shake hands with an Android user" guy who just has to virtue signal.

> And if facial recognition becomes that common for wearers, most of the population is going to adorn something to prevent that

My brother in Christ, you sincerely underestimate how much "most of the population" gives a shit. Most people are being tracked by Google Maps or FindMy, are triangulated with cell towers that know their exact coordinates, and willingly use social media that profiles them individually. The population doesn't even try in the slightest to resist any of it.


[flagged]


> I think you dramatically overestimate the number of people that would actually care about any theoretical privacy infringement

Not really surprised that you don't see it as a problem

> This is a very antiquated view IMO. You are already being filmed and monitored at work.

Not really surprised that you don't see it as a problem


> a privacy-aware secure LLM

Funniest thing I've heard all month.


Do read through the linked article. Not sure how you could make cloud compute more private if you tried, apart from homomorphic encryption.


I really think the value of this for Meta is content generation. More open models (especially state of the art) means more content is being generated, and more content is being shared on Meta platforms, so there is more advertising revenue for Meta.


He's not even pretending it's altruism. Literally about 1/3 of the entire post is the section titled "Why Open Source AI Is Good for Meta". I find it really weird that there are whole debates in threads here about whether it's altruistic when Zuckerberg isn't making that claim in the first place.


All the content generated by LLMs (good or bad) is going to end up back on Facebook/Instagram and other social media sites. This enables Meta to show growth and therefore command a higher stock price. So it makes sense to get content generation tools out there as widely as possible.


Zuckerberg didn't really say anything about altruism. The point he was making is an explicit "I believe open models are best for our business"

He was clear that one of their motivations is avoiding vendor lock-in. He doesn't want Meta to be under the control of their competitors or other AI providers.

He also recognizes the value brought to his company by open sourcing products. Just look at React, PyTorch, and GraphQL. All industry standards, and all brought tremendous value to Facebook.


You can always listen to the investor calls for the capitalist point of view. In short, attracting talent, building the ecosystem, and making it really easy for users to make stuff they want to share on Meta's social networks


He addresses this pretty clearly in the post. They don't want to be beholden to other companies to build the products they want to build. Their experience being under Apple's thumb on mobile strongly shaped this point of view.


Don't be fooled, it is an "embrace, extend, extinguish" strategy. Once they have enough usage and become the default standard, they will start to find any possible way to make you pay.


Credit where due: Facebook didn't do that with React or PyTorch. Meta will reap benefits for sure, but they don't seem to be betting on selling the model itself; rather, they will benefit from being at the forefront of a new ecosystem.


Hasn't really happened with PyTorch or any of their other open sourced releases tbh.


There's nothing open source about it.

It's a proprietary dump of data you can't replicate or verify.

What were the sources? What datasets was it trained on? What are the training parameters? And so on and so on.


> they have brought the cost of pretraining a frontier 400B+ parameter model to ZERO

It is still far from zero.


If the model is already pretrained, there's no need to pretrain it, so the cost of pretraining is zero.


Yeah but you only have the one model, and so far it seems to be only good on paper.


So far, it seems like this release has done ~nothing to the stock price for GOOGL/MSFT, which we all know has been propped up largely on the basis of their AI plans. So it's probably premature to say that this has messed it up for them.


> We’re releasing Llama 3.1 405B

Is it possible to run this with ollama?


If you have the RAM for it.

Ollama will offload as many layers as it can to the GPU; the rest will run on the CPU/RAM.
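A back-of-envelope sketch of that split (the model size and layer count here are illustrative assumptions, not ollama's actual accounting, which also budgets for the KV cache):

```python
# Estimate how many transformer layers fit in VRAM, with the
# remainder offloaded to system RAM. Treats all layers as
# equal-sized, which is roughly true for the repeated blocks.
def layer_split(model_gb, n_layers, vram_gb):
    per_layer_gb = model_gb / n_layers
    gpu_layers = min(n_layers, int(vram_gb // per_layer_gb))
    return gpu_layers, n_layers - gpu_layers

# e.g. a ~230 GB 4-bit quant of the 405B model (assuming 126 layers)
# on a single 24 GB GPU: only a small fraction fits on the GPU.
gpu, cpu = layer_split(230, 126, 24)
```

With most layers on the CPU, token generation is bottlenecked by system RAM bandwidth, which is why a mostly-offloaded 405B run is so slow.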


Sure, if you have an H100 cluster. If you quant it to int4 you might get away with using only 4 H100 GPUs!
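A rough sketch of why, counting weights only (this is a lower bound; KV cache and activation overhead are what push a real int4 deployment to the 4 GPUs mentioned above):

```python
# Minimum memory to hold 405B weights at various precisions, and the
# H100 count (80 GB each) implied by the weights alone.
PARAMS = 405e9
H100_GB = 80

for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    weights_gb = PARAMS * bytes_per_param / 1e9
    min_h100s = -(-weights_gb // H100_GB)  # ceiling division
    print(f"{name}: {weights_gb:.0f} GB -> at least {min_h100s:.0f}x H100")
```

So fp16 needs ~810 GB of weights across 11+ GPUs, while int4 squeezes the weights to ~203 GB, leaving headroom for the cache on a 4x H100 box.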


Assuming $25k a pop, that's at least $100k in GPUs alone. Throw in their linking technology (NVLink) and the cost of the remaining parts, and I won't be surprised if you're looking at $150k for such a cluster. Which is not bad, to be honest, for something at this scale.

Can anyone share the cost of their pre-built clusters, they’ve recently started selling? (sorry feeling lazy to research atm, I might do that later when I have more time).


You can rent H100 GPUs.


You're about right.

https://smicro.eu/nvidia-hgx-h100-640gb-935-24287-0001-000-1

8x H100 HGX cluster for €250k + VAT


If you want your first token around tomorrow lunch, sure.


>> Bravo! While I don't agree with Zuck's views and actions on many fronts, on this occasion I think he and the AI folks at Meta deserve our praise and gratitude.

Nope. Not one bit. Supporting F/OSS when it suits you in one area and then being totally dismissive of it in every other area should not be lauded. How about open sourcing some of FB's VR efforts?


They open sourced Gear VR. They are a stark contrast to other players in how they have built everything on open standards (OpenXR, WebXR, etc.), and they have just opened their platform by allowing third parties to build on and customise it to make their own commercial offerings. Not open source, but quite a contrast to every other player in that industry so far.



