In this Regression Analysis in Excel project, you'll be working with data from a company in the e-commerce sector: The Trendy Shopper, a fast-growing e-commerce business that offers a diverse selection of contemporary products across various categories, including fashion, electronics, and home decor. The company has a broad customer base and a wide range of products.

Does anyone know if it's possible to perform a multiple linear regression in Excel when the dependent variable is categorical rather than numerical? I have to try to find a formula to determine whether a population of people belongs in group A or B. I have quite a lot of sample data, about 390,000 rows in Excel. The independent variables are a mixture of categorical and numerical values, but at a push I could remove the categorical independent variables and just leave the numerical ones. I am not sure a linear regression is possible, though, because of the categorical dependent variable. Do I need to use another type of regression?

Also, whenever I try to run the analysis in Excel I get "Input data is non-numerical". However, it isn't; I checked it all, and even tried replacing it with randomly generated 1's and 0's, and I still get the same problem. What would the equation look like and what would it be saying? I thought it might be something to do with trying to run a linear regression with a categorical dependent variable. Anyone got any advice? Thanks in advance, and sorry for the stats question, but I'm hoping someone can help.

"What would the equation look like and what would it be saying?" In many ways, I think this is the real question, and it is not necessarily an Excel question. We would be asking the same question if we were trying to do this in Minitab, or Matlab, or insert-favorite-programming-language-here. With categorical data, what does "X" mean in the analysis?

I recall a question posed here where the user was doing something with cars. In this problem the cars were categorized into compact, midsize, and fullsize (maybe more). It was clear to me that the categorization was trying to indicate size. It seemed obvious to me that the user would want to assign some kind of number to each category that would be related to size: small numbers for the compact, large numbers for full-size, with mid-size in between. If I were doing this for real, I might even try to find an underlying real variable that would be an indicator of size, such as vehicle weight, or wheelbase, or engine displacement.

You haven't said what kind of data you are looking at, but I think that is the basic idea. You want to look at your categories and come up with some kind of numeric descriptor that can "measure" the difference (or differences) between categories. Without that, the regression will likely not have very much meaning.

I was thinking of a way of explaining better, and maybe it's easier if I come up with an analogous study rather than try to explain the actual data I've got. Okay, so suppose I want to split the population of a country into two groups: those that will live to less than 90 years, and those that will live 90 years or more. I have sample data from several hundred people and want to build a model or some kind of decision tree which predicts which category everyone outside the sample will fall into. So, a categorical dependent variable: it's not an age predictor, just "lives to 90" or "does not live to 90". Then my predictor variables are a mixture of categorical and numerical, say gender, current age, ethnicity, number of cigarettes smoked per day, units of alcohol drunk per week, parents alive/dead, etc.

I am not sure if this is even a stats question, as I don't think a linear regression is going to work here. What is the best method of building my model/decision logic?
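
One commonly used option when the dependent variable has only two categories is logistic regression, which models the probability of being in one group, not a numeric outcome. The sketch below is only an illustration in Python, not Excel, and not anyone's actual analysis. The data are invented, the "true" coefficients are made up, and every name (`one_hot`, `fit_logistic`, `make_row`, `GENDERS`) is hypothetical. It dummy-codes one categorical predictor (gender), rescales one numerical predictor (cigarettes per day), and fits the weights by plain gradient descent.

```python
import math
import random

GENDERS = ["male", "female"]  # hypothetical levels; the first is the baseline


def one_hot(value, levels):
    """Dummy-code a category, dropping the first level as the baseline."""
    return [1.0 if value == level else 0.0 for level in levels[1:]]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    """P(y = 1 | x) for weights w, where w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.2, epochs=1500):
    """Fit weights (intercept first) by batch gradient descent on the log-loss."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def make_row(gender, cigs_per_day):
    """Feature vector: [is_female, cigarettes per day / 10]."""
    return one_hot(gender, GENDERS) + [cigs_per_day / 10.0]


# Invented sample: 1 = "lives to 90", 0 = "does not live to 90".
# The coefficients used to simulate the labels are assumptions, not real findings.
random.seed(0)
X, y = [], []
for _ in range(300):
    row = make_row(random.choice(GENDERS), random.randint(0, 40))
    true_logit = 1.5 + 0.5 * row[0] - 1.0 * row[1]
    X.append(row)
    y.append(1 if random.random() < sigmoid(true_logit) else 0)

weights = fit_logistic(X, y)
p_nonsmoker = predict_proba(weights, make_row("female", 0))
p_heavy = predict_proba(weights, make_row("female", 40))

print("intercept, female, cigs/10:", [round(w, 2) for w in weights])
print("P(lives to 90), non-smoker:", round(p_nonsmoker, 2))
print("P(lives to 90), 40 a day:  ", round(p_heavy, 2))
```

The fitted equation says that the log-odds of being in the "lives to 90" group are the intercept plus a weight times each predictor. To assign someone to group A or B, compare the predicted probability with a cutoff such as 0.5. A categorical predictor with more than two levels, ethnicity for example, would get one 0/1 column per non-baseline level in the same way.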