Transcript for transcript_1RqFAT_dyxAfNTZGHwc03Qi_xp2hN26ey_1960_3580_cbb0530277ef_4.txt
[00:00:05] Please tell the candidate to keep the video on.
[00:00:10] Yeah.
[00:01:00] Good evening, Ashwant.
[00:01:03] Good evening, ma'am.
[00:01:04] Yeah.
[00:01:05] See, Ashwant, Vaishnavi is the technical head.
[00:01:07] She will take over the technical interview.
[00:01:10] Okay.
[00:01:11] Okay, ma'am.
[00:01:13] Hi, Ashwant.
[00:01:15] Hi, ma'am.
[00:01:17] Yeah, tell me about yourself.
[00:01:19] So, good evening, ma'am.
[00:01:22] I am Ashwant and I am currently pursuing a B.Tech at [institution name unclear] of Engineering and Technology in the stream of computer science and engineering.
[00:01:32] And my interests are in AI/ML and full stack web development.
[00:01:39] I recently completed hospitality management for hotels via Telegram bots.
[00:01:47] So the tech stack is React.js, Node Express and Telegram bot API.
[00:01:56] And I recently got experience from my previous internship at Infosys.
[00:02:04] I learned more about the practical implementation of Python, React and Express.
[00:02:12] There I built a cyber threat agent where threats are automatically detected.
[00:02:18] And my core strength lies in problem solving and building rapid web architecture.
[00:02:32] That's it.
[00:02:54] Can you hear me?
[00:02:54] Sorry, I was on mute.
[00:02:55] Oh, yes, ma'am.
[00:02:57] You mentioned you have used the Telegram bot, right?
[00:03:00] Yes, ma'am.
[00:03:01] How have you incorporated?
[00:03:03] Can you explain?
[00:03:05] The working of this Telegram bot, ma'am?
[00:03:07] Yes.
[00:03:09] So generally we will be creating a bot from a normal Telegram account and we will be getting an API key to access the bot's messages.
[00:03:23] So whenever a user sends a message to our bot,
[00:03:32] it will be sent to the Telegram API, and from there our backend webhook will be called.
[00:03:43] So we will be using the chat ID as well as the message sent by the user.
[00:03:48] And if the message is a string like "get info" or something similar to that, we will be replying to it directly.
[00:03:57] If it is something like a normal conversation, like a normal support request or some
[00:04:03] other thing, we will be using offline LLMs or else online LLMs to process the information and respond to it.
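The flow described in this answer can be sketched roughly as follows. This is a minimal illustration, not the candidate's code: `handle_update`, `CANNED_REPLIES` and the `ask_llm` callback are hypothetical names, and the sketch only assumes the standard shape of a Telegram update (`message.chat.id`, `message.text`) and of a `sendMessage` payload (`chat_id`, `text`).

```python
from typing import Callable, Optional

# Hypothetical canned replies for fixed commands such as "get info".
CANNED_REPLIES = {
    "get info": "Check-in is at 2 pm and check-out is at 11 am.",
}


def handle_update(update: dict, ask_llm: Callable[[str], str]) -> Optional[dict]:
    """Turn one Telegram webhook update into a sendMessage payload."""
    message = update.get("message") or {}
    chat_id = (message.get("chat") or {}).get("id")
    text = (message.get("text") or "").strip()
    if chat_id is None or not text:
        return None  # nothing to answer, e.g. a non-text update

    # Fixed strings get a direct reply; anything else goes to the LLM.
    reply = CANNED_REPLIES.get(text.lower())
    if reply is None:
        reply = ask_llm(text)
    return {"chat_id": chat_id, "text": reply}
```

The returned dictionary is what the backend would post to the bot API's `sendMessage` method; the LLM is passed in as a plain function so an offline or online model can be swapped without touching the routing.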
[00:04:14] Okay.
[00:04:16] So you have used RAG for the project?
[00:04:24] RAG, any vector database you have used?
[00:04:28] No, ma'am.
[00:04:30] Okay.
[00:04:31] Can you explain this machine learning project that you have done with Infosys?
[00:04:37] Yes, ma'am.
[00:04:38] It is the project I did on cyber threats.
[00:04:42] The project title is Cyber Threat Agent, so its main aim is to analyze or else find threats over the internet, such as
[00:05:00] problems like continuous requests or else
[00:05:10] query-type attacks, etc. The data set I have chosen for this is CSE-CIC-IDS, which is robust, containing at most 10 billion
[00:05:25] parameters.
[00:05:28] So how it works, I will explain the working.
[00:05:31] So how it works is, when a user accesses the website, we will be capturing the network packets in the meantime, like
[00:05:44] the data which the network will be storing in all those packets.
[00:05:48] So we will be analyzing these packets.
[00:05:50] If the user is in a stable condition and not in the error zone, or else our model is not detecting anything, it will let the user continue browsing.
[00:06:01] If the model has detected the user and
[00:06:07] flags the user as an attack, the model will generate a firewall rule to protect the server from that user, so the
[00:06:22] user will be blocked for a few minutes, like five to ten minutes.
[00:06:27] So that's how it works.
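The detect-then-block behaviour described here can be sketched as a temporary blocklist. This is only an illustration under assumptions: `TemporaryBlocklist`, `handle_request` and the `is_attack` callback are hypothetical names standing in for the trained model and the generated firewall rule, and the five-minute default is taken from the low end of the range mentioned in the answer.

```python
import time
from typing import Callable, Dict, Optional

BLOCK_SECONDS = 5 * 60  # assumed block length (the answer mentions five to ten minutes)


class TemporaryBlocklist:
    """Remembers which client IPs are blocked and until when."""

    def __init__(self, block_seconds: float = BLOCK_SECONDS) -> None:
        self.block_seconds = block_seconds
        self._blocked_until: Dict[str, float] = {}

    def block(self, ip: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._blocked_until[ip] = now + self.block_seconds

    def is_blocked(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expires = self._blocked_until.get(ip)
        if expires is None:
            return False
        if now >= expires:
            del self._blocked_until[ip]  # the block has run out
            return False
        return True


def handle_request(ip: str, features: dict, is_attack: Callable[[dict], bool],
                   blocklist: TemporaryBlocklist, now: Optional[float] = None) -> str:
    """Let normal traffic through; block flagged clients for a while."""
    if blocklist.is_blocked(ip, now):
        return "blocked"
    if is_attack(features):  # stands in for the classifier's prediction
        blocklist.block(ip, now)
        return "blocked"
    return "allowed"
```

The `now` argument exists only so the expiry logic can be exercised without waiting; in normal use it is left out and the wall clock is used.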
[00:06:31] Okay, so now, so this is a classification problem, is it?
[00:06:36] Yes, ma'am.
[00:06:38] So you detect whether there is a threat or no threat?
[00:06:41] Yes, ma'am.
[00:06:42] And the attack to...
[00:06:46] Implement the firewall timing and etc.
[00:06:50] How are you saying the accuracy is 98%?
[00:06:55] Like, we trained on the CSE-CIC-IDS dataset.
[00:06:59] What is the dataset?
[00:07:00] Come again?
[00:07:01] CSE-CIC-IDS, the IDS 2018 dataset.
[00:07:06] Okay.
[00:07:08] So we have trained on this data set and tested it on other data sets like UNBC data set as well as the
[00:07:22] I forgot the other data set.
[00:07:24] So we tested on this and we got an accurate result of 96 to 98.
[00:07:30] So this is the test result you are talking about, the 98 percent?
[00:07:35] Yes, not the real result.
[00:07:39] Not the real result like we haven't tested in real.
[00:07:45] Not in the real world, but with these data sets, test data is what you are doing.
[00:07:50] Yes, ma'am.
[00:07:52] Okay.
[00:07:54] Which model you have used?
[00:07:56] The model with which we got the highest accuracy is XGBoost.
[00:08:02] XGBoost.
[00:08:04] How does XGBoost work?
[00:08:08] It is similar to random forest, like it will be building trees.
[00:08:27] In the tree structure for the classification, it will be using several trees to make the decision, like
[00:08:40] it will be using several types of trees to make the final decision based on the maximum voting, similar to random forest.
[00:08:55] Okay.
[00:08:56] There is one difference between random forest and XGBoost.
[00:09:00] Can you tell me what it is?
[00:09:04] There is one difference between random forest and XGBoost.
[00:09:09] Can you tell me what it is?
[00:09:24] No, I forgot.
[00:09:27] Okay.
[00:09:29] Yeah, so random forest is a bagging based model and XGBoost is boosting model.
[00:09:35] So random forest will be parallel and I think XGBoost will be sequential.
[00:09:41] You can verify that later.
[00:09:42] Okay.
[00:09:43] Okay.
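The bagging versus boosting distinction can be illustrated with a toy sketch. It assumes one-split regression trees (stumps) on a single feature and plain gradient boosting with squared error; it is not XGBoost itself, which adds regularisation and other refinements, and all function names here are illustrative.

```python
import random
from typing import Callable, List

Stump = Callable[[float], float]


def fit_stump(xs: List[float], ys: List[float]) -> Stump:
    """Fit a one-split regression tree by minimising squared error."""
    best = (float("inf"), 0.0, 0.0, 0.0)
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right) if right else left_mean
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, split, low, high = best
    return lambda x: low if x <= split else high


def bagging(xs: List[float], ys: List[float], n_trees: int, rng: random.Random) -> Stump:
    """Bagging: every tree sees its own bootstrap sample and is independent
    of the others, so the trees could be trained in parallel."""
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(s(x) for s in stumps) / len(stumps)


def boosting(xs: List[float], ys: List[float], n_trees: int,
             learning_rate: float = 0.5) -> Stump:
    """Boosting: each tree is fitted to the errors left by the trees before
    it, so the trees have to be trained one after another."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(learning_rate * s(x) for s in stumps)
```

In the bagging loop no iteration reads anything produced by an earlier one, while the boosting loop needs `preds` from all previous trees before it can fit the next; that dependency is the sequential part.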
[00:09:46] Okay, so the precision, so you have measured accuracy, right?
[00:09:51] Yes, ma'am.
[00:09:52] So what will you do if your training data is biased?
[00:09:59] As in?
[00:10:00] You have, since it's a classification model, if you have a training data of,
[00:10:07] say, only 10% of the entire data set is of one classification and the rest is of another classification, how would you handle that situation?
[00:10:23] Yes, ma'am.
[00:10:23] So the question is like, let's consider the data set is like 100% of classification.
[00:10:30] So 5% of classification is very little.
[00:10:34] Let's say it is.
[00:10:35] Yeah, class A is only 5% and class B is like 95%.
[00:10:40] Yes, ma'am.
[00:10:40] So we will be using synthetic data like...
[00:10:44] We will be using that 5% data to create more data, almost similar to that data.
[00:10:52] So this ensures that the model is still learning that type of attack.
[00:10:57] So we will be using synthetic data.
[00:11:01] Like, we will be increasing the data set from five percent to 10 percent or 15 percent until the model clearly gets what the attack is and understands it.
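One way to create "more data, almost similar to that data" is to interpolate between existing minority-class rows. The sketch below is a simplified, SMOTE-like illustration: real SMOTE interpolates towards nearest neighbours, whereas this version picks random pairs, and `oversample_minority` is a hypothetical helper name.

```python
import random
from typing import List, Sequence


def oversample_minority(minority: List[Sequence[float]], target_size: int,
                        rng: random.Random) -> List[List[float]]:
    """Grow the minority class to target_size rows with synthetic rows."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rows = [list(r) for r in minority]  # keep every original row
    while len(rows) < target_size:
        a, b = rng.sample(minority, 2)  # two distinct real rows
        t = rng.random()                # position between a and b
        rows.append([x + t * (y - x) for x, y in zip(a, b)])
    return rows
```

Synthetic rows should be generated from the training split only, so that the test set still reflects the original class balance.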
[00:11:17] Okay, what kind of pre-processing, you know, you have done?
[00:11:26] What type of pre-processing are normal.
[00:11:29] Okay, first we cleaned the data set and we just used this sampling data where we got SQL injection types of attacks, very few in our data set.
[00:11:44] So,
[00:11:46] similarly, we have used the synthetic data as well as tried the SQL injection data from other data sets.
[00:11:54] So, mostly, before the model training, we did cleaning and increasing the data set.
[00:12:07] Values, and ensuring all the things are in the correct percentage.
[00:12:15] Okay.
[00:12:16] You know what is normalization?
[00:12:20] Normalize.
[00:12:31] Oh.
[00:12:35] No, I forgot that.
[00:12:36] Yeah, no problem.
[00:12:38] So what is SQL injection?
[00:12:41] SQL injection is like inserting queries into the input field and turning the normal processing of the request into some
[00:12:56] insertion of their own data, etc., things like that. Normally it works like this: when we are
[00:13:06] creating a new document or else something, we will be using a query to insert it into the database. So when a user passes a query and if we don't,
[00:13:21] like, if we don't secure it perfectly in the backend,
[00:13:26] it will be considered as SQL injection.
[00:13:28] Because that SQL query will be running in our own database, the main database.
[00:13:35] So it affects the database values or else leaks credentials.
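A small, self-contained illustration of the problem and the usual fix, using Python's built-in `sqlite3` module. The `users` table, its columns and the two helper functions are made up for the example; the point is the difference between pasting input into the SQL text and passing it as a bound parameter.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: the user's input becomes part of the SQL text itself.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safe: the value is bound as a parameter and never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password_hash TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "h1"), ("bob", "h2")])

# Classic payload: closes the quote and adds a condition that is always true.
payload = "x' OR '1'='1"
leaked = find_user_unsafe(conn, payload)   # matches every row in the table
nothing = find_user_safe(conn, payload)    # no user has this literal name
```

With the unsafe version the payload rewrites the `WHERE` clause, so every row comes back; with the parameterised version the same string is only ever compared as a name.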
[00:13:43] Okay.
[00:13:47] So in a table, so you have worked in RDBMS or?
[00:13:54] In your projects?
[00:13:56] Ma'am?
[00:13:57] Have you worked on relational databases?
[00:14:00] Yes, ma'am.
[00:14:02] Okay, you know what an index is?
[00:14:11] Yes, ma'am.
[00:14:13] What is it?
[00:14:14] So index is used to increase the query performance.
[00:14:18] So when we want to find a value in the database, without using an index, we will be searching all the columns and all the rows of values.
[00:14:28] But when we use the index, we will be searching only that column
[00:14:32] for the values. So we will be using an index on IDs, or else email IDs or numbers, which are unique as well as easy to find.
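The effect of an index on a lookup can be seen by asking the database for its query plan. The sketch below uses Python's built-in `sqlite3`; the table, the index name `idx_users_email` and the `query_plan` helper are illustrative, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"user{i}") for i in range(1000)],
)

sql = "SELECT name FROM users WHERE email = ?"
lookup = ("user500@example.com",)

# Without an index on email, SQLite has to scan the whole table.
plan_before = query_plan(conn, sql, lookup)

# A unique index on the lookup column lets it jump straight to the row.
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, sql, lookup)
```

Before the index the plan reports a scan of `users`; afterwards it reports a search using `idx_users_email`. The index narrows down which rows are read; it does not change which columns are returned.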
[00:14:46] Okay, what are the types of index?
[00:15:01] No, I don't.
[00:15:04] Yeah, no problem.
[00:15:09] In MongoDB, if you want to search for a, say, a particular user, so what is the keyword?
[00:15:17] How will you find it?
[00:15:20] You have a user table, you have user IDs, and you want to search for one particular user ID.
[00:15:27] Normally, like in the backend or something, we will be using mongoose.find and passing the user ID.
[00:15:37] And if we are using the query operator, like in a normal query,
[00:15:45] similar to it, users.find, and in the users.find function we will be placing the object with the user ID, etc.
[00:15:58] Which query IDE you have used?
[00:16:03] Which query IDE?
[00:16:05] What you have used, the client you have used for MongoDB.
[00:16:11] Okay, so the querying parameter like that?
[00:16:14] No, no, no, like Mongo Compass.
[00:16:16] You know what is a Mongo Compass?
[00:16:18] Yes, Mongo Compass is like database management.
[00:16:21] We can view data and etc.
[00:16:23] Okay, so you used Mongo Compass.
[00:16:26] Yes, ma'am.
[00:16:28] Okay.
[00:16:32] So, n8n, what have you used n8n for?
[00:16:40] I have used it for resume as well as
[00:16:46] Yeah, improving the resume condition.
[00:16:49] So I will be uploading my resume format and it will be giving me the points to improve and replace the sentences to improve the ATS score and etc.
[00:17:04] Okay, n8n is a workflow automation tool, right?
[00:17:08] Yes, ma'am.
[00:17:10] How are you using it for resume?
[00:17:11] So you have so many workflows written on that or how is it?
[00:17:18] No, ma'am.
[00:17:19] So for improving my resume, I used, like, the first node is for uploading the information and the second node is for extracting the information in JSON format.
[00:17:31] And the third one would be the model, for which I have got ChatGPT [unclear].
[00:17:38] So I did use that and I will be passing this information and telling the model.
[00:17:46] The objective of this model is to check the resume's ATS score and improve the sentences which reduce the ATS score, and it will be giving me the result in JSON form.
[00:17:59] Next.
[00:18:01] Okay.
[00:18:02] Okay.
[00:18:03] So if you have to store user ID and password.
[00:18:10] Yes, ma'am.
[00:18:10] Okay.
[00:18:11] So there is a website which takes the user ID and password and authenticate and lets the user use the website.
[00:18:19] Okay.
[00:18:20] So how will you make use of the password?
[00:18:23] How will you store it?
[00:18:24] How will you verify?
[00:18:28] Yes, ma'am.
[00:18:28] So we will be using bcrypt, the bcrypt.js
[00:18:33] library, in the backend.
[00:18:36] So whenever a user, let's say a user is registering for the website, I will also explain the login later.
[00:18:44] So whenever a user registers, he will be sending data like name, occupation, other details and the password.
[00:18:52] The password which the user sends is a raw password, like it will be a string, and we will be finding the password in request.body.
[00:19:04] So when we are saving this password, we will be hashing the password using bcrypt.hash with the password and the salt rounds.
[00:19:20] So we will be saving this hashed password.
[00:19:23] When a user logs in, he will be sending the raw password again to the backend and we will be comparing the raw password with the hashed password.
[00:19:32] So if it is valid, it will be passed, or else not.
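The register-then-login flow described above can be sketched with the Python standard library alone. Note the substitution: the answer names bcrypt, while this sketch uses PBKDF2 from `hashlib` as a stand-in so it runs without extra packages; `ITERATIONS`, `hash_password` and `verify_password` are illustrative names, and the iteration count is an assumed value playing the role of bcrypt's salt rounds.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, analogous to bcrypt's salt rounds


def hash_password(password: str) -> str:
    """Registration: store a salted, slow hash instead of the raw password."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Login: re-hash the submitted password with the stored salt and compare."""
    iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Because the salt and work factor are kept next to the digest, the stored string is all the backend needs at login, and two users with the same password still end up with different stored values.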
[00:19:46] Okay.
[00:19:48] So you have used JWT tokens authentication and all?
[00:19:53] Yes, ma'am.
[00:19:54] How does that work?
[00:19:56] So JWT authentication, the token, it would
[00:20:00] be taking an object to store, like when we decode that
[00:20:07] JWT token we need some information, so it will be storing that information object, and it will also be taking one
[00:20:17] secure hash string, which is unique. So it will be taking these two things and it will
[00:20:30] make a token starting with "ey" something.
[00:20:33] So the token creation is done in the backend. When a user logs in, the backend searches for the user and embeds the user data into the
[00:20:46] token, and it
[00:20:55] makes the token using the secure hash string as well as the user data.
[00:21:00] So this token is sent to the front end.
[00:21:03] The front end will receive this token and save it in session storage or else an HTTP-only cookie.
[00:21:10] And whenever a request is made, this token is used to verify the user in the backend.
[00:21:19] That's how it would be.
[00:21:25] Okay, so have you used any large language models?
[00:21:31] Large language models.
[00:21:32] I did use only a few of them, like Qwen Coder or else Ollama models.
[00:21:41] Not that large, like only at 8 billion tokens, etc.
[00:21:48] Okay, so which AI tool that you use on a regular basis?
[00:21:54] What AI tools you use?
[00:21:57] Yeah, if the project is on my own laptop and the query is shorter than a normal sentence, etc.,
[00:22:10] I would be using normal models.
[00:22:12] So if it is going out of bounds, I would be using at most
[00:22:23] 16 billion parameters or else 32 billion parameters.
[00:22:26] The model would be like Qwen Coder or else any online free tier models.
[00:22:36] So you mean to say you have tried Ollama models on your laptop itself?
[00:22:41] Yes, ma'am.
[00:22:42] So do you know, like, how many parameters they have?
[00:22:48] Yes, ma'am.
[00:22:49] At most, 8 billion parameters or 7 billion parameters.
[00:22:53] Like, they have models with more parameters, but the system's RAM and configuration would be low.
[00:23:03] So I am using 8 billion.
[00:23:06] Why have you used them?
[00:23:07] Like give a use case.
[00:23:12] Okay, so what purpose did I use it?
[00:23:14] Yeah.
[00:23:18] So I did use the Ollama model for my first chatbot, like my first project I did months ago.
[00:23:27] And the query is like very short.
[00:23:29] So I did use Ollama as my offline model.
[00:23:33] Just a simple chat, but I didn't write it.
[00:23:37] What kind of questions you will ask the model?
[00:23:43] General questions like some explain about this thing, explain about the meaning of some words and etc.
[00:23:56] So instead of Ollama you could have used Gemini models itself, right?
[00:24:01] They are available for free and all.
[00:24:04] Yes, ma'am.
[00:24:04] I did use the Gemini model, but there is an issue in free tier models.
[00:24:09] Like when the usage of the models is high, it will be blocking the free tier request.
[00:24:16] So I did not.
[00:24:20] So have you tried fine tuning the models, whatever the models that you have used?
[00:24:25] Have you tried fine tuning them or use them as such?
[00:24:32] No, ma'am.
[00:24:33] I didn't fine-tune it.
[00:24:36] The only thing...
[00:24:38] No, ma'am.
[00:24:38] I didn't fine-tune; I only specified a little, like what is your role and what to do, etc.
[00:24:44] Okay.
[00:24:52] Okay, from the front end, when you are sending a password to
[00:25:00] the backend for verification,
[00:25:02] So in your payload, will you send the password directly or is there any way that you can hide the passwords?
[00:25:15] No, ma'am.
[00:25:15] I did think about it, but never found a way.
[00:25:21] So you pass the password directly to the backend.
[00:25:26] That is what you have done so far.
[00:25:29] Yes.
[00:25:36] Yes, Ashwant, I think I'm done with my interview.
[00:25:39] Do you have any questions for me?
[00:25:42] No, ma'am.
[00:25:43] Okay.
[00:25:46] Thank you, Ashwant.
[00:25:47] Yeah, thank you, Ashwant.
[00:25:49] And we will let you know the results through your institute.
[00:25:51] Okay, you can coordinate with them.
[00:25:54] Okay, thank you.
[00:26:01] Hi, can I get any feedback for these two candidates?
[00:26:06] We will discuss with our team and then I will inform Cathy about it.
[00:26:14] We have a few more candidates to be attending tomorrow, right?
[00:26:18] Yeah, yes.
[00:26:19] Yeah, so we'll complete all this and then we'll give you the result through the Google sheet that she has, you know, given us.
[00:26:27] Sure.
[00:26:28] Thanks.
[00:26:29] Have a great day.
[00:26:30] Yeah, tomorrow also you can arrange at the same time between five to six.
[00:26:34] Yeah, sure.
[00:26:35] Thank you.
[00:26:36] Thank you.