What is Logistic Regression?

Asad Ashraf Karel · Published in Nerd For Tech · Feb 19, 2021 · 4 min read


Logistic Regression is the algorithm we mostly use when the prediction is a yes/no or true/false outcome, which it ultimately reports as 1 or 0. Most importantly, it models the likelihood of one class over the other. The rule of thumb is that standard Logistic Regression cannot be used for data with more than two classes.

In Linear Regression, the prediction can fall anywhere in the range (-infinity, +infinity), but in Logistic Regression the predictions are squashed into (0, 1). This is achieved with the sigmoid function, which converts the regression line into an S-shaped sigmoid curve.

The sigmoid function that performs this conversion is:

p = 1 / (1 + e^-(b0 + b1·x))

where b0 + b1·x is the linear regression line; the sigmoid bends this straight line into the sigmoid curve.

Hence we get our curve as:

[Figure: Linear vs. Logistic Regression curves]

The infinite range has been compressed into 0–1. Here we predict only yes/no or true/false as 1/0, which is why I said standard Logistic Regression fails for predictions with more than two classes. The division between the classes is set by an appropriate threshold value.

[Figure: Threshold value selection for classification]

In the figure, the two classes are separated based on their respective values on the x-axis. The blue line marks the x-axis values that help decide the threshold range, while the red line on the y-axis splits the points into the two classes. Accuracy depends on the y-axis value, but for linear data the x-axis ultimately guides the choice of threshold. Logistic Regression rests on a few important concepts, through which the analysis is actually done.
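A small sketch of threshold selection can make this concrete. The probability scores below are illustrative, not from the article; the idea is that the model outputs probabilities and the threshold turns them into class labels:

```python
import numpy as np

# Illustrative predicted probabilities for five observations.
probs = np.array([0.10, 0.35, 0.55, 0.80, 0.95])

# Sweep a few candidate thresholds: scores at or above the
# threshold are labeled class 1, the rest class 0.
for threshold in (0.3, 0.5, 0.7):
    labels = (probs >= threshold).astype(int)
    print(threshold, labels)

# Raising the threshold makes the model stricter about predicting class 1.
```

The common default is 0.5, but the best threshold depends on how costly each type of misclassification is.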

All the work is done by four concepts that build on one another:

1. Probability: Probability is simply the chance that an event or class occurs, and it is definitely affected by outliers. Its range is (0, 1). Scaling all values into this narrow range is possible, but small values shrink drastically and become hard to interpret. To overcome this issue, mathematicians chose the odds ratio.

2. Odds ratio: The odds ratio, p / (1 − p), gives values in the range (0, +infinity). The infinite upper end again makes the values hard to interpret, so mathematicians applied the natural logarithm (logₑ) to the odds ratio.

3. Log of odds ratio: Taking the log of the odds gives values in the range (−infinity, +infinity), symmetric around zero. To convert these values back into valid class probabilities for logistic regression, we use the sigmoid function.

4. Sigmoid function: The sigmoid function maps every value in the range (−infinity, +infinity) into (0, 1), so the predictions are well-defined probabilities.
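The chain above can be sketched in a few lines of Python. The probability p = 0.8 is just an illustrative value:

```python
import math

def odds(p):
    """Odds ratio: maps a probability in (0, 1) to (0, +infinity)."""
    return p / (1 - p)

def log_odds(p):
    """Log of odds (the logit): maps (0, 1) to (-infinity, +infinity)."""
    return math.log(odds(p))

def sigmoid(z):
    """Sigmoid: maps (-infinity, +infinity) back into (0, 1)."""
    return 1 / (1 + math.exp(-z))

p = 0.8
z = log_odds(p)
print(odds(p))       # ≈ 4.0, i.e. 4-to-1 odds
print(z)             # ≈ 1.386
print(sigmoid(z))    # ≈ 0.8 — sigmoid undoes the log-odds
```

Note that the sigmoid is exactly the inverse of the log-odds transform, which is why applying it to a fitted regression line recovers a probability.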

This is a little of the mathematics behind Logistic Regression, and it is very important for understanding the basics. We are still dealing with our regression line, as we can see the term y in the equations. This is how Logistic Regression works: it classifies every observation as either 1 or 0. Just as we had a model summary table in Simple Linear Regression, we have a similar summary in classification analysis; we simply draw a different inference from it here.

[Figure: model summary table]

From the summary table we infer that Pseudo R-squ. indicates the goodness of fit of the model, while the LLR p-value shows that at least one feature contributes to the model, since p-value < 0.05. The blue mark in the figure shows the overall impact of the features on the model, while the green mark gives the explicit contribution of each individual feature. We make predictions here in the same way.

Advantages:

· Logistic Regression is easy to implement.

· This model is commonly used in many practical scenarios.

· It is fast at classifying records.

· It is very efficient when the classes are linearly separable, and it is relatively resistant to overfitting.

· If we modify the algorithm, we can extend it to multi-class classification, which is known as multinomial logistic regression.
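The multinomial extension mentioned in the last point can be sketched with scikit-learn, whose LogisticRegression handles more than two classes out of the box; the three-class iris dataset is used here only as a convenient example:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris has 3 classes, so this exercises the multinomial case.
X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=1000)  # multi-class handled internally
clf.fit(X, y)

print(clf.classes_)      # the three class labels: 0, 1, 2
print(clf.score(X, y))   # training accuracy
```

Internally this generalizes the sigmoid to the softmax function, producing one probability per class that sum to 1.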

Disadvantages:

· For high-dimensional data, this algorithm may overfit.

· If the relationship between the features and the target is non-linear, plain logistic regression performs poorly.

· It becomes problematic when multicollinearity exists among the independent variables.

· It works well only with important, contributing features; including unnecessary features may hurt model accuracy.

· This model is very sensitive to outliers.

· To produce good predictions, this model needs a fairly large dataset to learn from.

Applications:

· Email: Spam / Not spam

· Online Credit Card Transaction: Fraudulent (Yes/No)
