What is the systematic process here? I sorted the data and took the top thousand records, which means I did not allow the records to be chosen at random; I injected my own preference, and that is called bias. If I in no way try to influence the selection of a record, that is called unbiased sampling. Now, bias is not the same thing as sampling error, but when bias is there, the sampling error will certainly be higher. When the sampling is unbiased and a given sample still shows a high sampling error, that is just the probability distribution at work: there is always a chance that a sample may be far off from the actual mean. But when you take an unbiased, which means random, sample, the chance of that happening is very small, say 0.01 per cent in proportion terms. Okay? Does that make sense?
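The contrast above can be sketched with a small simulation. This is only an illustration with made-up numbers: the population, the 50,000-rupee scale, and the sample size of 1,000 are all hypothetical, chosen to mimic the "sorted top thousand" example from the lecture.

```python
import random

random.seed(42)

# Hypothetical population: 100,000 skewed values (e.g. incomes).
population = [random.expovariate(1 / 50_000) for _ in range(100_000)]
pop_mean = sum(population) / len(population)

# Biased sample: sort the data and take the top 1,000 records.
biased_sample = sorted(population, reverse=True)[:1000]
biased_mean = sum(biased_sample) / len(biased_sample)

# Unbiased sample: 1,000 records chosen purely at random.
unbiased_sample = random.sample(population, 1000)
unbiased_mean = sum(unbiased_sample) / len(unbiased_sample)

print(f"population mean:        {pop_mean:,.0f}")
print(f"biased (top-1000) mean: {biased_mean:,.0f}")  # far above the population mean
print(f"random-sample mean:     {unbiased_mean:,.0f}")  # close to the population mean
```

The top-1,000 sample overshoots the population mean badly every time, while the random sample lands close to it; the random sample can still miss, but only by ordinary sampling error.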

So I hope you got what sampling error is: I take a sample, and how far the mean of that sample is from the population mean is called the sampling error. We all need to be cognizant of the fact that when we take one sample, the value computed from that sample will not be exactly the population parameter; there will be some gap. Now, how much gap you are willing to tolerate is one question, and how we reduce that gap is another. If my tolerance for error is low, my sample size should be large. If the population has a thousand records and I take a sample of all thousand, what will my sampling error be? Zero, because the sample is the population. In other words, if you want zero sampling error, take the whole population. But the fact of life is that you cannot do anything and everything on the whole population; that is why you have to live with error, and probability, the chance or likelihood of being wrong, comes into the picture. One more note before we prove the central limit theorem: the median is also a point estimator, but the median is not considered an unbiased estimator, which is why we usually work with the sample mean and build, say, a 95% confidence interval around it.
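The relationship between sample size and sampling error can be seen directly. This is a minimal sketch with an assumed population (normal values with mean 100); the specific sizes are arbitrary.

```python
import random
import statistics

random.seed(0)

# Hypothetical population of 50,000 values with a known mean.
population = [random.gauss(100, 20) for _ in range(50_000)]
pop_mean = statistics.mean(population)

# Sampling error = |sample mean - population mean| for one sample.
# As n grows, the error tends to shrink (roughly as 1/sqrt(n)).
for n in (10, 100, 1000, 10_000):
    sample = random.sample(population, n)
    err = abs(statistics.mean(sample) - pop_mean)
    print(f"n={n:>6}: sampling error = {err:.3f}")
```

Any single run may not be perfectly monotone, since each sample is random, but the trend toward smaller error with larger n shows up clearly, and sampling the whole population would drive the error to exactly zero.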

How much error you can live with depends on the situation; it is a subject-matter question. For example, how many of you are from mechanical engineering? There is something taught there, probably in the first year, which is common to all branches; apologies to those from a commerce or arts background, but some examples will come from the science side because they are very easy to relate to. Suppose I am drilling a hole in a plate, and a second component has to be fixed on top of it, which means the holes on the second component and on the first plate have to line up. Will they ever match 100%? No. So typically you do not make the hole in the underlying plate perfectly round; you give it a slightly elongated shape so that a small mismatch can still be managed. But then the bolt has to go through it, and if the hole is too big, the bolt sits too loose, and that is a problem as well.

That is where the concept of tolerance comes in; we were all given tolerances, for example, I want a bolt of a certain size plus or minus 0.05. As long as the deviation stays within that allowed range, the fixtures will fit and there is no problem; that allowed range is the acceptable error. So depending on the application, you have to decide case by case what the acceptable limit is; many of these concepts were actually taught to us way back. Now let us come to the central limit theorem. The central limit theorem states that, irrespective of the shape of the distribution of the original population, the sampling distribution of the mean will approach a normal distribution. This graph we just computed is the sampling distribution of the mean. Why is it called the sampling distribution of the mean?

Because we took samples, computed the mean of each, and then looked at the distribution of those means; that is why it is called the sampling distribution of the mean, and it approaches a normal distribution, the bell-shaped curve I showed, as the size of the sample increases and becomes large. That last point matters. What if I take samples of just five observations instead of 50, compute each mean, and plot them? With such a small sample size you probably should not expect this bell-shaped graph. That is why the statement says "as the size of the sample increases and becomes large": the sample size must be sufficiently large. And what counts as large? As a rule of thumb, thirty observations or more is considered large, but that is a thumb rule, not something that holds in all cases. This famous discovery was made by Lindeberg and Lévy, and it is a milestone discovery, because the entire statistics of sampling rests on the central limit theorem.
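The "shape does not matter, but sample size does" claim can be checked with a simulation. This is a sketch under assumed settings: an exponential (clearly skewed) population, 2,000 trials, and sample sizes 5 and 50, all chosen for illustration.

```python
import random
import statistics

random.seed(1)

# Hypothetical skewed population (exponential, clearly non-normal).
population = [random.expovariate(1.0) for _ in range(100_000)]

def sampling_distribution(sample_size, trials=2000):
    """Mean of each of `trials` samples, each with `sample_size` observations."""
    return [statistics.mean(random.sample(population, sample_size))
            for _ in range(trials)]

def skewness(xs):
    """Rough skewness: 0 for a symmetric (e.g. normal) shape."""
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

small = sampling_distribution(5)    # tiny samples: distribution of means still skewed
large = sampling_distribution(50)   # n >= 30: close to a normal bell shape

print(f"skewness of sample means, n=5:  {skewness(small):.2f}")
print(f"skewness of sample means, n=50: {skewness(large):.2f}")
```

The population itself is heavily skewed, yet the distribution of means at n = 50 is nearly symmetric, while at n = 5 the skew is still visible; a histogram of `large` would show the bell shape from the lecture slide.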

Everything we do with sampling, drawing inferences about a population from samples, rests completely on this central limit theorem, a landmark discovery by Lindeberg and Lévy. Now I am going to prove, point by point with a simulation, what the central limit theorem and the law of large numbers together are saying. There are three important points, and we are going to cover all of them using this simulation code. The first point is based on the law of large numbers.

The first point is that the mean of the sampling distribution, that is, the mean of the sample means, will be equal to the population mean, provided you have a sufficiently large number of trials. The second point is that the sampling distribution of the mean approaches a normal distribution. And the third point, which I have not yet spoken of, concerns the variance of the sampling distribution: there is dispersion in this distribution, and where there is dispersion there is variance. The variance of the sampling distribution will be equal to the variance of the population divided by n, the sample size; here we took n = 50, so the variance of this distribution of means should equal the population variance divided by 50. Now, how many samples are you going to use when you build a model? One, because I am not going to build 100 models. So how many trials there should be is, in real life, not that important, because in real life you work with one sample.
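The first and third points above can be verified numerically. This is a minimal sketch with an assumed uniform population; the sample size of 50 matches the lecture example, while the population size and 5,000 trials are arbitrary choices.

```python
import random
import statistics

random.seed(7)

# Hypothetical population; each trial draws a sample of n = 50, as in the lecture.
population = [random.uniform(0, 100) for _ in range(100_000)]
pop_mean = statistics.mean(population)
pop_var = statistics.pvariance(population)

n, trials = 50, 5000
means = [statistics.mean(random.sample(population, n)) for _ in range(trials)]

# Point 1: the mean of the sample means is close to the population mean.
print(f"mean of sample means:     {statistics.mean(means):.2f}  vs  mu = {pop_mean:.2f}")

# Point 3: the variance of the sample means is close to sigma^2 / n.
print(f"variance of sample means: {statistics.pvariance(means):.2f}  "
      f"vs  sigma^2/n = {pop_var / n:.2f}")
```

Point 2, normality, is easiest to check visually: a histogram of `means` shows the bell shape. With more trials, both printed pairs move closer together, which is exactly the law-of-large-numbers part of the argument.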