Questions tagged [scikit-learn]

For questions related to the Python package scikit-learn (also known as sklearn).

32 questions
6
votes
2 answers

Can ML be used to curve-fit data based on a dataset of example fits?

Say I have $x, y$ data connected by a function with some additional parameters $(a, b, c)$: $$ y = f(x; a, b, c) $$ Now, given a set of data points $(x, y)$, I want to determine $a, b, c$. If I know the model for $f$, this is a simple curve-fitting problem.…
argentum2f
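Since the excerpt frames this as learning a fitting procedure from examples, here is a minimal sketch of one common approach: simulate many (parameters, curve) pairs for a hypothetical model form, then train a multi-output regressor to map a sampled curve back to $(a, b, c)$. The model form, the $x$ grid, and the parameter ranges below are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical model: y = a * exp(-b * x) + c, sampled on a fixed x grid.
x_grid = np.linspace(0, 10, 50)

def simulate(a, b, c):
    return a * np.exp(-b * x_grid) + c

rng = np.random.default_rng(0)
params = rng.uniform([0.5, 0.1, -1.0], [5.0, 2.0, 1.0], size=(2000, 3))
X = np.array([simulate(a, b, c) for a, b, c in params])  # curves as features
Y = params                                               # targets: (a, b, c)

reg = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, Y)

# Predict parameters for a new, noisy curve.
true = (2.0, 0.7, 0.3)
y_new = simulate(*true) + rng.normal(0, 0.05, size=x_grid.size)
print(reg.predict(y_new.reshape(1, -1)))   # should be close to (2.0, 0.7, 0.3)
```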
5
votes
2 answers

Why isn't my decision tree classifier able to solve the XOR problem properly?

I was trying to solve the XOR problem with a decision tree, and the dataset looks like the one in the attached image. I plotted the fitted tree and got this result. As I understand it, the tree should have depth 2 and four leaves. The first comparison is puzzling, because it is close to…
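For reference, a small sketch of the two regimes: on the exact four-point XOR truth table a DecisionTreeClassifier does learn the expected depth-2, four-leaf tree, while on continuous XOR-like data the greedy splitter sees almost no impurity gain for the first split and typically grows a much larger tree. The synthetic data below is an assumption, not the asker's dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Exact XOR truth table: the fitted tree has depth 2 and four leaves.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.get_depth(), clf.get_n_leaves())        # expected: 2 4
print(export_text(clf, feature_names=["x1", "x2"]))

# Continuous XOR-like data: the first split has almost zero impurity gain,
# so the greedy splitter picks an essentially arbitrary threshold and the
# default (unlimited-depth) tree ends up far larger than the ideal one.
rng = np.random.default_rng(0)
Xn = rng.uniform(-1, 1, size=(400, 2))
yn = (Xn[:, 0] * Xn[:, 1] > 0).astype(int)
deep = DecisionTreeClassifier(random_state=0).fit(Xn, yn)
print(deep.get_depth(), deep.get_n_leaves())      # much deeper than 2
```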
4
votes
0 answers

When computing the ROC-AUC score for multi-class classification problems, when should we use One-vs-Rest and One-vs-One?

The scikit-learn documentation for the roc_auc_score method states that the parameter multi_class can take the value 'ovr' (which stands for One-vs-Rest) or 'ovo' (which stands for One-vs-One). These values are only applicable to multi-class…
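A minimal sketch of the two settings (the lowercase strings 'ovr' and 'ovo' are what roc_auc_score actually accepts); the iris-based example is only illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)

# One-vs-Rest: averages the AUC of each class against all the others;
# sensitive to class imbalance even with the default average='macro'.
print(roc_auc_score(y_te, proba, multi_class='ovr'))

# One-vs-One: averages AUC over all pairs of classes; insensitive to class
# imbalance with average='macro', but costs one comparison per class pair.
print(roc_auc_score(y_te, proba, multi_class='ovo'))
```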
2
votes
2 answers

Integrating machine learning models on microcontrollers

I converted my machine learning model (a random forest regressor) into C code using the 'emlearn' library, but the resulting .c file is 3.8 MB, which is too large for a microcontroller. I need the generated C code to be in the KB range; what should I do?
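A hedged sketch of the usual remedy: the generated C code grows with the total number of tree nodes, so constrain the forest before converting. The hyperparameter values below are illustrative assumptions, not a recommendation.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=2000, n_features=10, random_state=0)

small = RandomForestRegressor(
    n_estimators=10,      # fewer trees
    max_depth=8,          # shallower trees
    min_samples_leaf=5,   # fewer leaves per tree
    random_state=0,
).fit(X, y)

# Rough proxy for the size of the generated code: total node count.
print(sum(t.tree_.node_count for t in small.estimators_))

# The constrained model can then be converted with emlearn as before
# (see the emlearn docs for the exact conversion and quantisation options).
```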
2
votes
0 answers

How does matrix factorization help with recommendations when it converges to the initial user-item matrix?

We can say that matrix factorization of a matrix $R$, in general, is finding two matrices $P$ and $Q$ such that $R \approx PQ^{T}$, with some constraints on $P$ and $Q$. Looking at some matrix factorization algorithms on the internet like…
KindNewbie
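As a toy illustration of why a low-rank factorization does not simply reproduce $R$, here is a sketch using sklearn's NMF on an assumed 4x4 rating matrix. Note that plain NMF treats zeros as observed values, whereas dedicated recommenders mask missing entries in the loss.

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy user-item rating matrix (rows: users, columns: items).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

# Rank-2 factorization R ≈ P @ Q.T: because the rank is much smaller than
# the matrix, the reconstruction cannot simply memorise R, and the zero
# cells are filled with values implied by the latent factors.
nmf = NMF(n_components=2, init='random', max_iter=1000, random_state=0)
P = nmf.fit_transform(R)      # user factors, shape (4, 2)
Q = nmf.components_.T         # item factors, shape (4, 2)
print(np.round(P @ Q.T, 2))
```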
2
votes
0 answers

Suitable deep learning algorithms for spatial / geometric data

I have the task of classifying spatial data from a geographic information system (GIS). More precisely, I need a way to filter out unnecessary line segments from a CAD system before loading them into the GIS (see the attached picture; colors are for illustrative…
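One hedged sketch of a classical (non-deep) baseline, assuming each segment can be labelled keep/discard from previously cleaned drawings: derive simple geometric features per segment and train a tree ensemble. The coordinates and labels below are random placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical layout: each row is a line segment (x1, y1, x2, y2) plus a
# manual keep/discard label taken from previously cleaned drawings.
rng = np.random.default_rng(0)
segments = rng.uniform(0, 100, size=(500, 4))
labels = rng.integers(0, 2, size=500)        # placeholder labels

def segment_features(s):
    x1, y1, x2, y2 = s.T
    length = np.hypot(x2 - x1, y2 - y1)
    angle = np.arctan2(y2 - y1, x2 - x1)
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    return np.column_stack([length, np.sin(angle), np.cos(angle), cx, cy])

X = segment_features(segments)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, labels, cv=5).mean())
```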
2
votes
1 answer

Is it compulsory to normalize the dataset if doing so can negatively impact binary logistic regression performance?

I am using a raw dataset with 4 feature variables (total cholesterol, systolic blood pressure, diastolic blood pressure, and cigarette count) to do a binomial classification (predicting stroke likelihood) using the logistic regression algorithm. I made sure…
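Normalization is not compulsory for logistic regression, but it usually helps the solver converge and keeps regularization from favouring large-scale features. A quick way to decide is to compare pipelines with and without scaling; breast_cancer is used below only as a stand-in binary medical dataset.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)       # stand-in binary dataset

raw = LogisticRegression(max_iter=10000)
scaled = make_pipeline(StandardScaler(), LogisticRegression(max_iter=10000))

# Compare cross-validated performance with and without standardisation;
# scaling mainly affects solver convergence and how regularisation is
# shared across features, so results can go either way on a given dataset.
print(cross_val_score(raw, X, y, cv=5, scoring='roc_auc').mean())
print(cross_val_score(scaled, X, y, cv=5, scoring='roc_auc').mean())
```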
1
vote
1 answer

How to train a scikit-learn model for classifying movements using accelerometer data?

I am working on a motion classification task using accelerometer data collected at 25 Hz during different exercises. The goal is to classify movements such as pull-ups, push-ups, and dips. Each batch of data consists of 50 samples (2 seconds), where each…
Gripen
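A minimal sketch of the usual feature-based pipeline, assuming windows of shape (50, 3), i.e. 2 s at 25 Hz over three axes, with one exercise label per window. The arrays below are random placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical input: (n_windows, 50, 3) windows with one label per window.
rng = np.random.default_rng(0)
windows = rng.normal(size=(300, 50, 3))
labels = rng.integers(0, 3, size=300)        # 0=pull-up, 1=push-up, 2=dip

def window_features(w):
    # Summary statistics per axis flatten each (50, 3) window to one row.
    return np.concatenate([w.mean(axis=0), w.std(axis=0),
                           w.min(axis=0), w.max(axis=0)])

X = np.array([window_features(w) for w in windows])
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, labels, cv=5).mean())
```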
1
vote
1 answer

Why isn't class_weight='balanced' impacting my F1 score for an imbalanced dataset (SVM)?

I'm using MNIST to test how class imbalance can impact an SVM model. I have a training set with 50 examples of '0'. I then increase the number of '1' training examples (starting from 1 example of '1' up to 999 examples of '1' in the training…
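A small sketch reproducing the setup on the sklearn digits data (as a stand-in for MNIST): class_weight='balanced' rescales the penalty C per class, so if the minority class is already separated correctly with the default weights, the F1 score barely changes.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Build an imbalanced two-class problem from the digits data (0 vs 1).
X, y = load_digits(return_X_y=True)
mask = (y == 0) | (y == 1)
X, y = X[mask], y[mask]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Keep all '0' examples but only a handful of '1' examples for training.
keep = np.concatenate([np.where(y_tr == 0)[0], np.where(y_tr == 1)[0][:10]])
X_tr, y_tr = X_tr[keep], y_tr[keep]

for cw in (None, 'balanced'):
    pred = SVC(class_weight=cw).fit(X_tr, y_tr).predict(X_te)
    # If the classes are separable even with the default weights, the
    # rescaled C has little effect and F1 stays roughly the same.
    print(cw, f1_score(y_te, pred))
```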
1
vote
0 answers

Using ML to uncover procedural logic

The game Elite Dangerous has a procedurally generated galaxy of some 400 billion star systems. Each star system in the game can be uniquely identified by a 64-bit number (id64), which is used as a seed for building the system but can also be decoded…
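If this were attempted with scikit-learn, one hedged starting point is to expose the raw bits of id64 as separate features, so a tree model can latch onto bit fields without the encoding being known in advance. Everything below (ids, labels, the target property) is a placeholder for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical setup: a table of known systems with their id64 and some
# property to predict. Each of the 64 bits becomes one binary feature.
rng = np.random.default_rng(0)
id64 = rng.integers(0, 2**63, size=5000, dtype=np.uint64)
labels = rng.integers(0, 2, size=5000)       # placeholder target property

shifts = np.arange(64, dtype=np.uint64)
bits = ((id64[:, None] >> shifts) & np.uint64(1)).astype(np.uint8)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, bits, labels, cv=3).mean())  # ≈0.5 on random labels
```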
1
vote
1 answer

Unexpected behaviour when using class weights in the loss

I’m working on a classification problem (500 classes). My NN has 3 fully connected layers, followed by an LSTM layer. I use nn.CrossEntropyLoss() as my loss function. To tackle the problem of class imbalance, I use sklearn’s class_weight while…
helloworld
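For reference, a minimal sketch of the standard pattern for feeding sklearn's balanced class weights into nn.CrossEntropyLoss; the labels here are random and imbalanced purely for illustration.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.utils.class_weight import compute_class_weight

# Training labels (random here, with a deliberate imbalance over 5 classes).
rng = np.random.default_rng(0)
y_train = rng.choice(5, size=1000, p=[0.6, 0.2, 0.1, 0.05, 0.05])

# 'balanced' gives each class c the weight n_samples / (n_classes * count(c)),
# so rare classes get proportionally larger weights.
weights = compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)
criterion = nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))

# The weighted loss rescales each sample's contribution by its class weight;
# with the default reduction='mean' it also divides by the weights used.
logits = torch.randn(8, 5)
targets = torch.tensor(rng.choice(5, size=8))
print(criterion(logits, targets))
```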
1
vote
1 answer

Why does sklearn perceptron converge for linearly inseparable data points?

I learned that the perceptron algorithm only converges if the dataset is linearly separable. I am implementing this algorithm using scikit-learn. The blue and orange points are from the training set, while the red and green ones are from the test set.…
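A short sketch of why fit() still returns on inseparable data: sklearn's Perceptron stops after max_iter epochs, or earlier when the loss stops improving by more than tol, rather than running until it makes no mistakes. The overlapping blobs below are synthetic.

```python
import numpy as np
from sklearn.linear_model import Perceptron

# Linearly inseparable data: two heavily overlapping Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(0.5, 1.0, (200, 2))])
y = np.array([0] * 200 + [1] * 200)

# fit() terminates via max_iter / tol, not via "zero mistakes", so it
# always returns a (possibly imperfect) linear classifier.
clf = Perceptron(max_iter=1000, tol=1e-3, random_state=0).fit(X, y)
print(clf.n_iter_, clf.score(X, y))   # finite epoch count, accuracy < 1.0
```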
1
vote
1 answer

How can I interpret the value returned by score(X) method of sklearn.neighbors.KernelDensity?

For sklearn.neighbors.KernelDensity, the sklearn KDE documentation describes the related scoring methods as computing the log-likelihood of each sample under the model. For the 'gaussian' kernel, I have implemented hyper-parameter tuning for the…
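A quick check of what score(X) returns for a Gaussian-kernel KDE: it is the sum of score_samples(X), i.e. the total log-likelihood of X under the fitted density, where higher is better and values are log-densities (so they can be positive or negative depending on bandwidth and data scale). Synthetic data is used for illustration.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1))

kde = KernelDensity(kernel='gaussian', bandwidth=0.5).fit(X)

# score_samples(X) gives the log-density of each sample; score(X) is
# their sum, i.e. the total log-likelihood of X under the fitted model.
print(np.allclose(kde.score(X), kde.score_samples(X).sum()))   # True
print(kde.score(X))
```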
1
vote
1 answer

Interpretation of feature selection based on the model

The description of feature selection based on a random forest uses trees without pruning. Do I need to use tree pruning? The thing is, if I don't prune the trees, the forest will overfit. The picture below shows the feature importances based on 500…
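A hedged sketch of a common sanity check: keep the unpruned forest for impurity-based importances, but compare them against permutation importance on held-out data, which is less prone to inflating noisy features. The data below is synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Impurity-based importances come from the training data and can inflate
# noisy or high-cardinality features; permutation importance on a held-out
# set measures the actual drop in test performance per feature.
forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print(np.round(forest.feature_importances_, 3))
perm = permutation_importance(forest, X_te, y_te, n_repeats=10, random_state=0)
print(np.round(perm.importances_mean, 3))
```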
1
vote
0 answers

How can I split the data into training and validation sets such that entries with a certain value are kept together?

I have the following kind of data frame (this is just an example):

A 1 Normal
A 2 Normal
A 3 Stress
B 1 Normal
B 2 Stress
B 3 Stress
C 1 Normal
C 2 Normal
C 3 Normal

I want to do 5-fold cross-validation, splitting the data using skf =…
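A minimal sketch using a group-aware splitter, which guarantees that all rows sharing the first-column value land in the same fold; with more groups in the real data, n_splits can be set to 5, and StratifiedGroupKFold is an alternative when class stratification also matters.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# The example table: group letter, measurement id (implicit), condition label.
groups = np.array(['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'])
y = np.array(['Normal', 'Normal', 'Stress',
              'Normal', 'Stress', 'Stress',
              'Normal', 'Normal', 'Normal'])
X = np.arange(len(y)).reshape(-1, 1)   # placeholder features

# GroupKFold never puts rows from the same group into both train and test.
cv = GroupKFold(n_splits=3)            # only 3 groups in this toy example
for train_idx, test_idx in cv.split(X, y, groups=groups):
    print(sorted(set(groups[test_idx])), list(y[test_idx]))
```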