Chapter 260 Why do I always feel like I’m digging a hole for myself? (Second update!)
"I think we still have a chance." Brockman glanced at Sam Altman beside him, then said to Musk, "Youzi Technology from Da Zhou may be in the lead right now, but they have already uploaded many examples to GitHub, and it now appears that all of them relate to the Orange model. That means they are very likely to open-source the Orange model soon."
“Even if they don’t open source it, they will soon publish the principles of the model.”
"Elon, OpenAI has the world's top research team. As long as we find the right direction, no matter how far ahead the others are, we will be able to catch up and even surpass them."
"As you know, Da Zhou is far behind us in deep learning, in both environment and technology. So their lead this time most likely comes from stumbling onto a path we don't know about, one that happens to be exactly right."
"I don't think we should give up at this point, Elon."
Worried as he was, Brockman calmly pushed back against Musk.
Musk took a deep breath and said, "Okay, I can give you another six months. If you still can't come up with an effective response strategy after six months, I think merging OpenAI into Tesla would be a more effective solution."
"They've responded! The Orange large model uses a sequence-to-sequence mechanism and an architecture that combines feedforward and recurrent neural networks. They will present a report on its architecture at the IEEE International Conference on Communications in Kuala Lumpur on May 23rd!"
Ilya shouted.
"YES!" Brockman pumped his fist inwardly.
Even if they didn't open-source it, as long as they published a report there would be a direction to follow, and then it would come down to a head-to-head contest!
OpenAI can't lose!
05:30 California time, 20:30 Haixi time
The sky over San Francisco was just beginning to turn pale, but traffic on Haixi Road had already entered its second peak, with people heading home after dinner or overtime.
In the GL8, Hua Zecheng held a tablet and kept checking the back-end dashboard of the Orange large model.
They had worked hard for three months to complete the Orange model's functions. The image function was not yet fully developed, but the model was already practical enough to release.
The 300 internal test slots were only a trial run.
After the internal test there would be another week of bug fixes and parameter tuning, followed by a half-month public beta.
The number of public beta slots would be expanded 100-fold, to 30,000!
That would be the first real stress test for Youzi Technology's current servers.
"Boss, when will the computing center be completed? The almanac says May 9th is suitable for opening registration, but it's already March 18th. You need to give us at least two weeks to migrate the data and integrate the system. Otherwise, we won't be able to open registration by then. Don't blame me."
Hua Zecheng sat next to Fang Yu, looking at the tablet with worry.
Three hundred test slots put little load on Youzi Technology's computing power, but next week there would be 30,000.
Hua Zecheng had calculated that with the current computing power, the system could handle at most a little over 35,000 concurrent requests, and enough computing power still had to be set aside for the development team. That left very little redundancy overall. If any equipment failed during the public beta, the available computing power would drop further.
If it really came down to it, they could just go with Alibaba Cloud, which was far cheaper than building a data center of their own.
Hua Zecheng was not responsible for the overall planning and optimization of the data center, and he had no idea how powerful the Y series data center would be.
Fang Yu felt helpless when he heard Hua Zecheng's complaints.
A few days earlier, he had asked Youzi to look through Nvidia's servers and find out when the P100 would ship.
After poking around, Youzi came back and told him that it would take at least another six months, and that was the optimistic estimate.
Judging by the pace of work it had observed at Nvidia, a year was more likely.
The computing card, advertised as using HBM2 memory and the new NVLink bus, had not even been formally taped out and was still going through testing, refinement, and deployment.
All of that would take until at least this time next year.
Isn't this a waste of time?
Old Huang is indeed a big liar!
The computing card won't ship until next year, so why announce it now?
And they started hyping it a month before the announcement.
Damn it, what a waste of my time.
So Fang Yu had no choice but to order 25 million dollars' worth of M60 cards first to handle future user requests.
They would make do for a year and then expand the data center once the P100 shipped.
"It's almost there. The 7,000 M60 cards will start arriving in batches next week. Just wait a little longer." Fang Yu offered this empty promise, then activated the Core of Esseron and asked Youzi whether the revised plan based on the M60 was ready.
He was anxious too. After all, Youzi Technology was due to appear at IEEE on May 23. Once its report was delivered, a wave of competitors would inevitably emerge within a short time. Most companies would probably adopt the framework of the Orange model, but many large companies would certainly insist on independent research in the same direction.
If they really succeeded, how could he gain control of global artificial intelligence by spreading the underlying principles of the Orange model?
The possibility was small, but it was not zero.
So he had to widen his lead quickly and force them into the arms of the Orange large model.
"Of course I'm ready, sir. I've already sent the revised plan back to Hongwan Intelligence." Youzi was very dissatisfied with Fang Yu's lack of trust in its abilities.
Fang Yu's capitalist nature was on full display: "Why submit it so early? Why not optimize it again? Every 1% gain in system efficiency saves 8 million in costs. Withdraw the plan and come up with a new one. It must improve efficiency by at least 5%."
Youzi choked. It had been careless.
I forgot what a dog this master is.
Haven't I just made more work for myself?
It would have been nice to spend this time watching a few more episodes of Classic of Mountains and Seas: Legend of the Red Shadow.
Nazha is so beautiful, as beautiful as Repa.
Love it.
It's just that Xinyue Fox is too pretentious, even more pretentious than the dog of a master.
"Master, I can't do it, I really can't do it." Youzi cried and wailed.
"Under the current Youzi architecture, only about 11% of the M60's computing power can be applied to the Orange large model, and that is after I modified the core instructions. Otherwise, the utilization rate would not even reach 8%."
Only 11% of the computing power can be used? How can it be that low? The load looks quite high.
"If you don't believe me, take a look, Master. This is the analysis I did earlier." Youzi quickly tossed a one-page report over through the Core of Esseron.
"High load doesn't mean high utilization. A large share of the M60's computing units are either not needed or cannot be used by the Youzi architecture. I have already maximized the M60's fit with the Youzi architecture by rewriting the core instructions. There is no way to push it any higher."
Fang Yu took a closer look and found that it was true.
After all, Nvidia is a graphics card company, and the computing cards it makes still integrate a large amount of graphics processing functions.
Texture units, rasterization units, geometry processing units, render output units, hybrid anti-aliasing units...all of these units have been retained.
However, the Youzi framework did not need most of what these units did.
Nvidia is really something. I only want the M60 for plain computation. Why give me so many graphics card functions?
Who uses an M60 to play games?
"That's not quite fair. These units aren't needed by the Youzi framework, but many other computing models do need them, such as generative adversarial networks (GANs). When images are generated during adversarial training, texture units make generation faster."
"I can only push the utilization rate to 11%. That's the limit. Even if Nvidia's own engineers tuned it, they would get only a little over 9.1%."
"There's no way around it. After all, Nvidia's chips aren't specifically designed for the Youzi framework, so they have to be compatible with all models."
Youzi seized every opportunity to show off its achievements.
Fang Yu nodded and was about to say something, but when he heard Youzi's last sentence, he suddenly felt like he had missed something.
"What did you say just now?" Fang Yu asked Youzi anxiously.
Youzi said in a confused tone, "I said Nvidia's chips have to be compatible with all models."
"Not this one, the previous one!"
"Nvidia's chips aren't specifically designed for the Youzi framework?" Youzi asked cautiously.
For some unknown reason, it felt a little uneasy.
Why do I always feel like I'm digging a hole for myself?
"Yes! That's it!" Fang Yu suddenly clapped his hands, startling Hua Zecheng, who was still looking at the tablet beside him.
"It's okay, it's okay. I just remembered something important." Fang Yu smiled and patted Hua Zecheng's thigh, continuing to communicate with Youzi in his mind.
"Youzi, collect all the chip technology data from Nvidia, AMD, Intel, ASML, TSMC, ARM, and Qualcomm, and eat it all!"
Fang Yu gave Youzi the order through the Core of Esseron without hesitation.
"Ah?" Youzi was stunned. How long would it take to finish eating all that?
Even if my clone can now get into these companies' internal servers, copying such top-secret information without leaving a trace means moving it bit by bit, like ants moving house.
"This is just the first step." Fang Yu ignored Youzi, who was straining to pull a crying face inside the Core of Esseron, and went on giving orders.
"Once you've digested their data, combine their technologies, refine and optimize them, and design a computing chip built solely for the Youzi framework and the Orange large model!"
In the living room of Hanning Mansion, Youzi looked at Zhang Han on TV and suddenly felt that his face became even more hateful.
"Master, in that case, should we cancel the M60 order or not?" Youzi had already mastered the art of indirect communication. "If we cancel the order, we'll lose the deposit."
Fang Yu smiled slightly. "No, why cancel the order? I didn't say we have to make the chip right now. You should design the chip first."
Software plus hardware, a two-pronged approach. It seemed the Youzi architecture was destined to dominate the market!
Over the past decade of artificial intelligence development, the two most important milestones were both led by Google.
The first milestone is undoubtedly DeepMind's AlphaGo, and the second is the groundbreaking paper "Attention Is All You Need," published by Google Brain in June 2017.
In this paper, eight researchers at Google Brain first demonstrated the potential of the multi-head attention mechanism for NLP. At the time, the original Transformer model had a mere 100M parameters. The model abandoned recurrent neural networks (RNNs) and convolutional neural networks (CNNs) entirely, replacing them with an attention mechanism and an encoder-decoder architecture.
It is worth noting that Ilya from OpenAI is Ilya Sutskever, not Illia Polosukhin, one of the authors of this paper.
After the article was published on June 12, 2017, it did not immediately cause a significant impact. Moreover, due to its difficulty in convergence and its low efficiency compared to the relatively mature LSTM, most researchers, including OpenAI, did not focus on the transformer architecture with attention mechanism at this stage.
At the beginning of 2018, OpenAI was still training with LSTMs and defeated humans in Dota 2. Just a few months later, OpenAI released GPT-1.
This shows that a few months is enough time to make a large model.
(End of this chapter)