What would you do if you learned true magic?
Fang Yu, a sophomore in college, never expected that he would actually acquire a mage tower!
This is an urban magic "face-slapping" ...
Chapter 316: Ysou Goes Overseas (6143)
Is this data scraped in real time? How is that possible?
How could Youzi Technology have such a large data center and bandwidth?
Never mind Youzi Technology, which has received only 1 billion Malaysian dollars in investment; even for Dami, whose cash flow has basically turned positive again, funding a search engine would be a fantasy!
"Real-time crawling? Does Youzi Technology have enough bandwidth and servers?"
Lei Jun couldn't figure out how Youzi Technology had pulled off Ysou.
Today's search engines, whether built on Robert Lee's hyperlink technology or Google's PageRank, essentially work the same way: web crawlers start from one or more well-known websites, follow the links on each page to reach more pages, and read their content along the way.
The crawled content is not searched directly. Instead, it is analyzed, and key information such as the text, title, keywords, and links is extracted and stored in the search engine's index library.
This index library is like a directory of Internet content, helping search engines quickly find relevant pages when users initiate queries.
The difference between the two is that Robert Lee's hyperlink technology solves the problem of how to crawl, while PageRank solves the problem of how to weight web pages.
Two web pages with the same content, a page from the White House and a personal page from a child in Africa, will obviously have different weights.
Google's PageRank algorithm weights these web pages and calculates which ones are more valuable, making them easier to find.
These two technologies are also the most basic technologies of today's search engines. Almost all search engines are built on these two technologies.
But this brings up a problem.
The problem of bandwidth and an enormous database.
Bandwidth determines the search engine's crawling speed and user experience speed, while the database determines the accuracy and richness of search results.
Countless new web pages are created on the Internet every second. Where is the database of crawled links stored? How much server space is needed?
Although it only stores links and content indexes, the entire Internet web page volume is too large, and even this small part is not something that any small business can afford.
Google spends as much as $7 billion every year on adding, updating, and maintaining servers, and this amount is increasing every year.
Both Google and Qianxun got into this field during the Internet's wild early days. Back then, crawling every link on the Internet did not require much in the way of server resources.
But it is not like that now. After more than a decade of development, the Internet has become a behemoth, with more than 3.4 billion Internet users, accounting for 45% of the world's population.
If we exclude preschool children who have not yet registered an Internet account and elderly people who have no knowledge of the Internet, the proportion may have exceeded 65%.
You can imagine how huge the amount of data on the Internet is today.
Search engine giants such as Google and Qianxun have grown step by step along with the Internet. Their revenue growth rate is even faster than the growth of the Internet. Naturally, they can continue to increase their investment to add new servers and respond to user needs.
This is also the reason why there are no new entrants in this industry.
This is a completely accumulation-based industry with a very deep moat that is simply not something that ordinary companies can cross.
To overthrow Google or Qianxun on product strength alone, meaning search experience, content richness, and search accuracy, the only way is for a giant or a tycoon to pour in tens of billions of dollars regardless of returns, crawl the entire Internet, and use sophisticated algorithms to build a search engine that can match Qianxun or Google.
Even that would only be a contest of raw strength. Whether it could really surpass Qianxun and Google is hard to say.
Precisely because of cost, search engines do not crawl every web page at the same frequency.
The crawler will dynamically adjust the crawling frequency based on the importance of the web page, the update frequency, and the crawling strategy of the website.
Important pages, such as news websites and search engines' own news centers, may be re-crawled every few minutes, while rarely updated pages may not be re-crawled for days, weeks, or even months.
But the Ysou results that Lei Jun and Zhou Shuzi had just seen not only included pages that are generally assumed to be crawled rarely, but also showed content captured only a few minutes earlier.
For example, there is a self-media article written by Dazui, which was published 5 minutes ago.
Generally speaking, search engines rarely crawl this kind of self-media content. Unless you use a vertical search, such as searching for a Toutiao account inside Toutiao, you will not find it on Qianxun or Google.
Just like this webpage, due to the problem of crawling frequency, this article cannot be found by searching Qianxun and Google.
But Ysou did find it, and the quality of the article was not low.
Could it be that Ysou just happened to crawl this link?
Isn't that too much of a coincidence?
"Ysou is not a fully real-time search. It actually takes a different direction from traditional search engine technology." Fang Yu stubbed out his cigarette in the ashtray.
He was not a heavy smoker. He had chosen to meet Lei Jun and Zhou Shuzi outdoors because Lei Jun was one, going through two packs a day. Smoking was strictly prohibited indoors in Xinhao, so a coffee shop with an outdoor area made it easier for smokers to talk business.
"The search technology used by Ysou is completely different from traditional search technology. Traditional search technology downloads links, then weights and indexes them to build a database."
"Ysou uses a large model to analyze and learn the data connections among the 1.7 billion web pages currently on the Internet. It then makes probabilistic judgments about which links are likely to be of higher quality and provides search results based on those probabilities."
"Therefore, Ysou doesn't need a lot of servers to store the specific data of these web pages. The index of these links has already been 'learned' by the large model, so we only need to store the links." (Note 1)
"When a user searches, the large model automatically returns the links it believes meet the user's needs, based on the user's intent or its own judgment."
"As for the question of crawling frequency, it's actually not that difficult. According to real-time data from internetlivestats, there are currently 1.3 billion web pages on the Internet, 50% of which are empty or broken links."
"After removing these, there are only over 600 million links. Of these 600 million links, nearly 400 million are 'inactive websites.'"
"The Orange algorithm makes decisions based on 'data tags'. If the already crawled 'data tags' haven't changed, the crawl will not be repeated. Only after the 'data tags' have changed will the Orange model proactively crawl the updated webpage to ensure its own data is up to date, and then create a new 'data tag'."
"The advantage of this technology is that we don't need to build as many large data centers as Qianxun and Google."
"A single-story data center covering 20,000 square meters should be enough to meet the search needs of all users in Great Zhou. The investment is probably less than one percent of Google's. For now, Ysou runs on Ali Cloud."
"Of course, if we want to develop other businesses, such as the cloud storage, encyclopedia, document library, map, email, and other services that Qianxun and Google currently offer, we will still need large data centers to support them."
"Another benefit of this technology is that review and filtering become very easy. Once the review and filtering rules are set, Ysou can filter the information that needs review more precisely and avoid collateral damage."
"In the AI era, uncontaminated data is extremely important, but the Zhouwen data on the Great Zhou Internet is currently too polluted, so large models trained on it perform very poorly."
"A considerable part of that comes from content being removed by mistake during review, which makes Zhouwen data hard to train on. With the Ysou algorithm, we can accurately identify the search results that need to be filtered and cut mistaken removals by 97.98%."
"Although this won't produce any significant results in the short term, over time it will have considerable benefits for the entire Great Zhou's internet data resources."
"The bandwidth we need is not much different from what Qianxun needs now. After all, both sending data and returning it take bandwidth, but for a search engine this part of the cost is not a large share to begin with."
"The biggest difficulty with this technology is that the changes in most web pages are difficult to accurately predict, and a reliable crawling strategy is needed to keep the data up to date and ensure the accuracy of the links and generated index."
"But fortunately, we have made some breakthroughs in this area. Of course, the specific algorithms are confidential, so I won't share them with you two."
"Because of the cost savings in various aspects, I can maintain the normal operation of this search engine even if Ysou does not go public."
Lei Jun looked at Fang Yu's phone screen as if he were looking at an alien. "You mean Ysou is a large model disguised as a search engine?"
In just a few months, AI has revolutionized the search engine industry?
What kind of evolution speed is this!?
Is it possible to do it?
If this is true, which industry will be the next to be disrupted?
Lei Jun suddenly felt somewhat fortunate that Dami had chosen to start with hardware and could become a carrier for AI.
If he had chosen software innovation for the mobile Internet instead, he would probably be too worried to sleep by now, right?
Fang Yu immediately corrected Lei Jun: "No, it can only be considered a search engine integrated with AI functions."
Too much is as bad as too little. Integrating AI into search engines is one thing, but making the search engine itself a large AI model is another.
Currently, most people are still at the stage where they know about AI but have not yet experienced it firsthand.
At this time, if they find that the operating logic of the search function they use daily has fundamentally changed, they will inevitably become wary of AI.
By then, you never know what might happen.
Fang Yu said earnestly, "This involves technical information that hasn't been made public yet. I'm telling you this, Mr. Lei, because I trust you. You're not the kind of person who gossips, and I hope you can keep it confidential for me."
Lei Jun smiled bitterly. He was now convinced that Fang Yu truly did not want to take Ysou public.
Under this model, the threshold for operating a search engine with full network coverage is greatly lowered. Even a startup company that has just entered the unicorn stage, such as Youzi Technology, can enter this field.
No, it cannot be considered as being lowered. Being able to build and pre-train such a large model is itself a threshold.
Especially the algorithms Fang Yu mentioned are feasible in theory, but only in theory.
If these algorithms were so easy to make, what would be the point of Qianxun and Google? These two companies would have been overthrown long ago.
But it was actually developed by a small company like Youzi Technology!
Turning to look at Zhou Shuzi again, Lei Jun saw an eagerness and anticipation in his little brother's eyes that he had never seen before.
Lei Jun sighed in his heart, but he did not blame Zhou Shuzi.
No one could fail to be tempted by a vision that would completely upend the future.
"Xiao Fang, if that's the case, then Shuzi doesn't need to go there, right? If you're not going public, Shuzi would be wasted there. Qianxun and Google should have plenty of more suitable people."
Silently, Lei Jun changed the way he addressed Fang Yu and touched his pocket.
"By the way, I heard that Lu Qi from Pseudosoft has resigned now, and Qianxun is trying to contact him. If you contact him now, he should be very interested."
"Qianxun's Yuan Shanjun and Liu Anlin are also said to be looking for opportunities outside the company. They are more familiar with the search engine business and were instrumental in Qianxun's commercialization."
Yuan Shanjun? Liu Anlin? I forced these two out to find jobs. How could I possibly hire them?
Qianxun's technical staff is pretty good, but the management? Haha, forget it. If the top beam is not straight, the bottom beam will be crooked. The road has already gone astray.
As for Lu Qi...
The executives at Pseudosoft Great Zhou love going to nightclubs and fooling around with female colleagues, just like people in finance.
Lu Qi has always been at Pseudosoft headquarters, but if he came, there was no guarantee he wouldn't bring along a few executives from Pseudosoft Great Zhou.
A few executives who like nightclubs and affairs with female colleagues would set a bad tone for the whole company.
I just said Qianxun is crooked from the top beam down. I don't want Ysou's culture to end up even worse than Qianxun's.
Fang Yu is very dissatisfied with many professional managers of foreign companies.
These people claim to have an international perspective, but in fact all they can do is talk big and maneuver carefully within a company's established structure. They do well by leaning on the platform's resources and then think it is their own ability.
In fact, it's bullshit.
For a while, Fang Daqiang poached a lot of professional managers from several foreign companies. He paid them basically double what they had earned at the foreign companies, triple in some cases, and gave them ample authority as well.
As a result, once these people arrived, they immediately started forming cliques and squeezing out anyone outside them, and then started lining their own pockets.
It’s not that foreign companies don’t have strong people. These people’s basic qualities and abilities are definitely much stronger than many professional managers in private companies, but that doesn’t mean they can use these abilities in your company.
"If you think Qianxun's people are not good enough, you can also look for someone from Google. Philip Schneider of Google is very good at operations management. I met him before in Hamburg, Prussia."
Lei Jun looks like an otaku, but he is actually very good at reading people. Sensing that Fang Yu was not interested in any of these candidates, he began recommending a vice president from Google.
Fang Yu smiled and handed Lei Jun another cigarette. "Boss Lei, we don't recruit non-Zhou people for this position at Ysou, but we're not looking for Zhou people with a Great Zhou background either."
"To be honest, besides his outstanding abilities, Brother Shuzi's background is also a major reason why I wanted him to come to Ysou. Brother Shuzi, I want to speak frankly, and please forgive me if I have offended you."
After saying that, Fang Yu smiled apologetically at Zhou Shuzi.
Zhou Shuzi was a little confused.
Background? What background do I have? My wife does have some background, but it doesn't have anything to do with IT.
An idea suddenly flashed through Lei Jun's mind: "You want to go overseas!?"
Fang Yu snapped his fingers and chuckled, "Bingo! As expected of Mr. Lei."
Lei Jun held the cigarette between two fingers and waved it. When the ash fell on his pants, he quickly brushed it off with his hand.
"No wonder you bought the 'why' domain name after acquiring the 'Y' domain name. It turns out you're targeting the international market."
Lei Jun sighed.
"If we're talking about going overseas, Shuzi is indeed a good choice. His Lijiapo background is indeed suitable for developing the Southeast Asian and Bharat subcontinent markets."
Fang Yu smiled noncommittally and looked at Zhou Shuzi: "Brother Shuzi, what do you think? Are you interested? At your level, I don't need to discuss any salary issues with you. If Mr. Lei can afford it, so can I."
Zhou Shuzi was obviously very tempted. This was a far more attractive job than running Dami's IPO!
Even at its best, Dami's market value on listing day would be around 100 billion yuan.
Moreover, with Sansang cutting off supply to Dami, this year's production capacity problems and the Mi 5's product strength problems would certainly drag down Dami's sales, and it was hard to say what the valuation would be by then.
But precisely because of that, leaving Dami now would be a bit disloyal.
If Mr. Lei objected and held a grudge, it would hurt Zhou Shuzi's reputation.
Zhou Shuzi's eyes flashed and he looked at Lei Jun.
At the same time, Fang Yu also looked at Lei Jun, who was supporting his chin with his wrist.
"Mr. Lei, an IPO is indeed very important for Dami, but this job is not something that only Brother Shuzi can do."
"As long as Dami can be profitable and show signs of brand enhancement and momentum to become the fourth pole in the mobile phone industry, there are plenty of professionals who can make it happen."
"I've said before that Mr. Lei is an entrepreneur and founder I've always admired. I didn't want any rift to come between us as partners, so I didn't approach Brother Shuzi in advance. That has put Brother Shuzi in a slightly awkward position, and Mr. Lei as well."
"How about this, Mr. Lei? I can make you a promise. If Youzi Technology works with any other mobile phone brand on AI systematization in the future, the price I give them will be 30% to 50% higher than yours. We can sign a lowest-price agreement, valid for five years."
!!!
Lei Jun's body trembled, and he wanted to say something but didn't.
Fang Yu smiled knowingly: "Boss Lei, you and Brother Shuzi can discuss it. I'll go back today. Brother Shuzi, if you have thought it over, give me a call. I'll go and pay the bill first."
Fang Yu picked up his phone, stood up, and turned to go pay the bill, but then he suddenly remembered something and slapped his forehead.
"Mr. Lei, have you decided on spokespersons for the Mi Mix and Note 2 you're releasing in October? Could you do me a favor?"
As Dami's core partner, Fang Yu naturally knew Dami's product plans for the second half of the year.
Lei Jun was stunned. This was a matter for the brand strategy department. He had just listened to Li Wanqiang's report and had some impression of it.
"Note2 is mainly for business, and we're in contact with Liang Chaowei. It seems they're looking for that guy for Mix, the one who just came back from Korea, he's quite handsome, Wu..."
"Mei Yeping." Zhou Shuzi reminded from the side.
Lei Jun patted his forehead and said self-deprecatingly, "Look at my memory. Yes, that's right, it's him. They said he has a lot of traffic right now, young people like him a lot, and he fits the Mix's black-technology image."
Mei Yeping?
"Boss Lei, can you give the Mix to Yang Mi? And the Note2 to Reba?"
While putting in a word for Da Mimi, he couldn't forget Reba. Everyone had to be treated equally.
Fang Yu did not add any nonsense along the lines of: it's fine if the spokesperson can't be changed, I'm only asking for a favor.
For people of Fang Yu and Lei Jun's level, this kind of thing is not important at all, it's just a matter of a word.
It just depends on whether you are willing to say this.
Besides, for Dami, it hardly matters who the spokesperson is.
People buy Dami phones either for the value for money or because they are fans of the brand. To put it bluntly, the core user base is losers. How many star-chasers would buy a Dami phone?
I don't know who picked Mei Yeping. His fans are all women, and even with him as your spokesperson, those women probably won't buy your phones.
Your fan base is young male losers. A beautiful woman as spokesperson would at least be easy on your users' eyes.
Pick Mei Yeping, and since few men don't hate him, you will lose more of your core base than the traffic he brings in.
Da Mimi would be a great choice. After all, the main selling point of your Mix is Yamato's black technology.
Da Mimi's size is no problem, and there is plenty of black technology in her face, which fits the brand's tone.
Sure enough, Lei Jun didn't make a big deal of it: "The Mix is no problem. The contract probably hasn't been signed yet. But this Reba you mentioned, doesn't she clash with the Note2's business tone?"
What business-oriented tone does the Note2 have? Who would use it for business purposes now?
Isn't that like making eyes at a blind man?
Besides, Liang Chaowei has no pull with male customers, and men don't see him as particularly businesslike.
I guess it was done by a female fan from the brand department.
If you really want to go for a business image, you might as well find a few bosses who have bought your phones to endorse it. Dami may not have a business tone right now, but with such a large user base, it would be easy to find a few senior professional managers or private company owners among its fans.
If that doesn't work, you can get a few of your tycoon friends to be your spokespeople.
He Xiaopeng, the former boss of UC who is planning to build cars; Da Qiangzi, the husband of Milk Tea; Chen Nian, the boss of Fanke; and yourself. Several big bosses hold the Note2, backlit and in profile. The camera follows the moving light until the lens settles on the bosses' showy poses and the Note2 in their hands.
The voiceover is a rich baritone: "Life is about pushing your limits again and again. Dami Note2, break through your limits and achieve yourself!"
Then, every so often, release a few street shots or everyday photos of these bosses using the Note2 to generate some buzz.
Isn’t this better than hiring Liang Chaowei?
Lei Jun pondered for a moment and said, "How about this? Redmi has three spokespersons: Wu Xiubo, Liu Shishi, and a young man who's become quite famous recently. I'll replace one of them with this Reba you mentioned."
Fang Yu smiled and said, "Thank you, Mr. Lei."
Note 1: The data we learn is based on web page metadata, not web page content, so it does not contradict the data scarcity problem in the data crisis mentioned in the previous chapters.
To put it simply, take a book as an example: the title of the book is stored on the server, and the large model learns the table of contents and at most a summary.
This technical idea is my own. I searched the literature and found no relevant papers.
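For readers curious about the "data tag" idea in this chapter, here is a rough Python sketch, not the actual algorithm in the story. It assumes a "data tag" is simply a hash of a page's metadata (title, description, and so on); the chapter never defines it, and the names make_tag, TagStore, and needs_recrawl are made up for illustration.

```python
import hashlib


def make_tag(metadata: dict) -> str:
    """Hash page metadata into a fingerprint (the assumed 'data tag')."""
    # Sort the keys so the same metadata always yields the same tag.
    canonical = "\n".join(f"{key}={metadata[key]}" for key in sorted(metadata))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TagStore:
    """Hypothetical store that keeps only each URL's latest tag, not page content."""

    def __init__(self):
        self._tags = {}

    def needs_recrawl(self, url: str, metadata: dict) -> bool:
        """Return True if the URL is new or its metadata changed, then save the new tag."""
        new_tag = make_tag(metadata)
        if self._tags.get(url) == new_tag:
            return False  # tag unchanged: skip the crawl
        self._tags[url] = new_tag  # tag changed: crawl again and record the new tag
        return True


store = TagStore()
page = {"title": "Hello", "description": "first version"}

first = store.needs_recrawl("https://example.com/a", page)   # never seen before
second = store.needs_recrawl("https://example.com/a", page)  # nothing changed
page["description"] = "edited"
third = store.needs_recrawl("https://example.com/a", page)   # metadata changed

print(first, second, third)
```

Only the URL and a fixed-size tag are kept per page, which matches the claim that far less storage is needed than for a full content index. How the metadata would be fetched cheaply in the first place is left open here, as it is in the chapter.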
(End of this chapter)