Chapter 116 Joining the Group (First Update)
"How do you handle duplicate values in your data?"
Cheng Feng pushed up his glasses, stole a glance at Fang Yu, and asked hesitantly.
Is this guy a sports student? He has to be a sports student, right?
Have sports students all started doing modeling now?
As soon as Fang Yu walked in, his height of nearly 1.9 meters and his sturdy figure gave Cheng Feng a lot of visual pressure.
After Fang Yu sat down, Cheng Feng felt a subtle sense of pressure from Fang Yu, which made him feel like he was facing a mentor.
Fang Yu simply took Cheng Feng for a nerdy senior and paid little attention to his odd behavior.
What Fang Yu didn't know was that the continuous elevation of his life essence made ordinary people feel a trace of fear in his presence, like lower-level creatures facing a higher-level one.
Now Fang Yu is only a second-level mage. When he reaches a higher level, this situation will become more obvious.
Many arcane mages who do not want to give up normal human social life will choose to use props like the "Gentle Power" bracelet or arcane seals to suppress their superior aura.
Hearing Cheng Feng's question, Fang Yu looked thoughtful. "Use pandas' duplicated method, which returns a Boolean Series flagging the duplicate rows, and then use drop_duplicates to delete them, with its keep parameter controlling which copy is retained."
"If features are duplicated, calculate the similarity between features with the corr method, using the method parameter to specify the Kendall or Spearman correlation coefficient."
Hearing Fang Yu's answer, Cheng Feng was stunned. This was completely beyond his expectations.
Although Cheng Feng only asked some basic questions, he was completely surprised to get such a clear answer.
Cheng Feng still didn't dare look at Fang Yu and kept staring at the screen. "How did you identify the outliers? I saw you used winsorizing to adjust them. Why winsorizing instead of removing them or replacing them with the mode?"
Fang Yu thought for a moment and said, "For numerical data, we can use box plots and histograms. We can also use the descriptive information generated by the describe function. For categorical data, we can use bar charts. For some normally distributed data, we can use the 3σ criterion."
"As for processing, deleting the outliers would significantly reduce the number of samples, and I don't know whether the subsequent algorithm is sensitive to outliers. Replacing them with the mode could mask the variability of the data and distort the results. So I used winsorizing to adjust them."
Cheng Feng looked at Fang Yu and didn't say anything for a long time.
At least when he was a sophomore, he definitely hadn't been at this level.
Never mind a finance major like Fang Yu; even a math major would only just have started on basic data structures and programming by sophomore year. At best, he would have only begun learning data cleaning and manipulation.
Many people don't understand how to evaluate outliers until their senior year, or even until they join a graduate research group, and they may end up mistaking normal values for outliers.
Although Fang Yu's answers to these two basic questions were concise, it was obvious that he had mastered the skills of data cleaning.
More importantly, he was able to not only distinguish different methods for dealing with duplicate values and outliers, but also discuss the applicable scenarios of different statistical methods and provide specific code implementations.
This is not something that ordinary students can master, unless they have been deeply involved in some data modeling projects.
Are you kidding me? Why would a finance major like you need to be so good at applied math? Is it necessary?
Aren't you supposed to be out there competing for resources after graduation? Why are you taking jobs from us small-town test-takers?
Also, how did you, a finance major, get so good at math? Don't you take any specialized courses?
Didn’t you know that economics and financial mathematics are the reserved domains of mathematics and physics majors?
Look at these professors in the school, which one of them didn’t study mathematics and physics in undergraduate studies?
As a student from the School of Economics, why are you joining in the economics craze?
Is this the difference between a true genius and a small-town test-taker?
No, the biggest difference between you two is that Fang Yu has a cheat.
"Senior?" Fang Yu called Cheng Feng in confusion.
In the answer just now, most of the technical answers were conveyed by Yuzu through the Core of Eselon, but Fang Yu still added some opinions.
"Junior Fang Yu, I didn't..." Cheng Feng had just said half of his sentence when he heard a voice coming from the laboratory door.
"If you use a linear regression model later, how do you plan to deal with the outliers and feature similarities in this set of data?" Fang Yu turned around and saw Tong Yongshan walking in from the door.
Behind Tong Yongshan was a young woman of about 26 or 27 years old wearing a cheongsam.
The woman was not especially pretty. If Fang Yu were to score her, he would give her 70 points at most for her looks.
Her figure was pretty good, about 80 points.
But her charm was at the level of 90 points.
"Teacher! Senior Sister."
"Hello, Dean."
Cheng Feng quickly stood up from his chair and greeted his instructor.
Fang Yu also stood up and politely greeted his dean, then nodded to the young woman whom Cheng Feng called "senior sister".
The woman in the cheongsam pursed her lips and smiled gently, her eyes full of charm. She did not introduce herself to Fang Yu, but walked to the tea room with a graceful sway of her waist and started making coffee.
"No need to be so formal, just answer the question." Tong Yongshan pulled over a chair and sat opposite Fang Yu, flipping through a stack of documents Cheng Feng had printed out.
Fang Yu sat down calmly and thought for a moment. "In linear regression analysis, outliers can significantly affect the regression coefficient and the accuracy of the prediction. Therefore, the first thing to do is to accurately identify outliers."
"I might use diagnostic plots, such as residual plots or influence plots, to identify these outliers. Once they are identified, I would prefer robust regression techniques to reduce the influence of these points."
"For example, using LAD regression, or applying a transformation such as a logarithmic transformation, can stabilize the variance of the data and improve the overall performance of the model."
"As for the feature similarity issue you just mentioned, highly correlated explanatory variables may lead to multicollinearity, which is a serious problem for linear regression models. Therefore, accurately evaluating the similarity between features is the most important issue."
"In this case, I prefer to use VIF to assess the degree of collinearity among the variables."
"I believe that exploratory factor analysis or principal component analysis can reduce the dimensionality of the data without losing too much information. If used properly, it may effectively reveal the structural connections hidden behind the data, thereby optimizing the predictive ability and explanatory power of the model."
"Finally, in terms of feature similarity, looking at future trends, I personally believe that we should not only focus on the traditional correlation coefficient, but also consider the cointegration properties of time series data or the causal relationship between variables."
"Therefore, using machine learning techniques such as artificial neural networks to reveal complex nonlinear relationships between variables may be the most important future development direction."
"Dean, I have finished answering." Fang Yu looked straight at Tong Yongshan with a calm expression.
Hearing Fang Yu's answer, Cheng Feng couldn't help but take a breath.
If Fang Yu's earlier answers to Cheng Feng had only demonstrated technical proficiency and project experience, his answer to Tong Yongshan's question completely surpassed the academic level of an average graduate student.
Most master's students are still at the stage of learning and application. As long as they can skillfully use data processing tools, they are already qualified scientific researchers.
Fang Yu's answer went far beyond this. It not only demonstrated a deep understanding of complex data analysis theory, but also demonstrated considerable original research capabilities and the ability to apply technology to solve a wider range of problems.
Could it be that the true strength of this sophomore student is already that of a doctoral candidate?
It's so terrifying!
Tong Yongshan couldn't help but show obvious admiration and even clapped twice.
He was not shocked by Fang Yu's professional ability. During his many years in the United States, whether at Columbia University, the University of Pennsylvania, or MIT, he had seen the world's top mathematical and scientific talent gathered together. There were plenty of 16-year-olds whose professional abilities surpassed those of doctoral supervisors.
What really surprised him was that Fang Yu actually dared to make a clear prediction about the direction of academic professional development!
If he hadn't been certain that he had never shown his unsubmitted research proposal to anyone, Tong Yongshan would even have suspected that Fang Yu had peeked at his research plan!
A graduate student like Cheng Feng might not have picked up on anything, but Tong Yongshan was different. The last sentence Fang Yu had just said made his scalp tingle.
Yes, a considerable part of what Fang Yu just said is exactly the next research direction he has been preparing for nearly three months!
Like Boya meeting Ziqi, for a pure scholar there is nothing more exciting than meeting a kindred spirit.
"Fang Yu, sophomore, Finance Class 2. I never thought there would be such a student in our college. Good! Good! Good!" Tong Yongshan took a look at Fang Yu's information and applauded.
Tong Yongshan had never been good with words, even as a child. The last time he had said "good" three times in a row to a student was 10 years ago, when he took on Lin Fangdong as his student at the University of Pennsylvania.
Lin Fangdong had since become a rising star in the economics world and entered a period of rapid growth. Last year he published three articles in the top five journals, and he was on the verge of becoming another leader in academic circles.
"Teacher, coffee." The senior sister who had just scored 90 points for charm and 70 for appearance handed Tong Yongshan a cup of steaming hot coffee, her eyes drifting over Fang Yu without a trace of expression.
"Nanzhen, come and meet your junior brother Fang Yu. You will be working together from now on." Tong Yongshan laughed heartily, then turned to Fang Yu and said, "Your senior sister Jiang Nanzhen is a new doctoral student I recruited after returning to the country. The two of you should talk more in the future."
"Teacher, junior brother Fang Yu hasn't agreed to join the group yet." The girl called Nanzhen chuckled, the corners of her eyes turning up slightly as she laughed.
Tong Yongshan slapped his forehead, but didn't care too much.
In his opinion, never mind a sophomore; even a PhD student on campus who had already joined another group would not be able to refuse such an opportunity.
Fang Yu hesitated for a moment: "Dean, can I ask if there are any attendance requirements in our group?"
Tong Yongshan was stunned. He didn't expect Fang Yu to ask such a question.
Jiang Nanzhen's eyes flickered, and she smiled faintly, "Junior Fang Yu, in the teacher's group, you have quite a bit of free time, but you still have to attend group meetings on time. If you have something else to do, you can ask for leave, but you still have to do your work well. In fact, the workload in the group is very heavy, and even if there is no attendance requirement, you may not have much time to rest."
Fang Yu breathed a sigh of relief. If it was just attending a group meeting and there were no specific attendance requirements, then it would be easy.
As for the heavy workload?
Isn't Yuzu there for that?
It's the perfect chance to find something for that ball to do.
I don’t know why, but now I feel uncomfortable when I see it idle.
"That's no problem. Thank you, Dean. I can join the group anytime."
Fang Yu patted his chest, making his chest muscles pop.
(End of this chapter)