I'm trying to build skills for working with very few labeled samples (250 labeled out of 20,000 total, 200 features) by practicing on the Kaggle "Don't Overfit" dataset (Target_Practice provides all 20,000 targets). I've read many papers and articles on this topic, but nothing I've tried has improved on a simple regularized SVM (best accuracy 75%, AUC 0.85), and no other algorithm (LR, k-NN, Naive Bayes, RF, MLP) did better. I believe better results are possible: scores on the leaderboard go above AUC 0.95.
What I've tried without success:
- Outlier removal: I tried removing 5-10% of the samples as outliers with EllipticEnvelope and with IsolationForest.
- Feature selection: I tried RFE (with and without CV) combined with L1/L2-regularized LogisticRegression, and SelectKBest (with chi2).
- Semi-supervised techniques: I tried co-training with different combinations of two complementary algorithms and a 100/100 split of the features. I also tried LabelSpreading, but I don't know how to supply the most uncertain samples (I tried using predictions from other algorithms, but too many samples were mislabeled, so it didn't work).
- Ensembling classifiers: I tried StackingClassifier with all possible combinations of the algorithms, and this didn't improve the result either (the best matches the SVM: accuracy 75%, AUC 0.85).
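The baseline setup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: synthetic data from `make_classification` stands in for the Kaggle file, and the linear kernel, `C=0.1`, and the function name `evaluate_baseline` are placeholders rather than the settings actually used.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate_baseline(n_total=20000, n_labeled=250, n_features=200, C=0.1, seed=0):
    """Fit a regularized linear SVM on a small labeled subset and score the rest."""
    # Assumption: synthetic data replaces the real "Don't Overfit" file.
    X, y = make_classification(
        n_samples=n_total,
        n_features=n_features,
        n_informative=20,
        random_state=seed,
    )
    # Keep only n_labeled rows for training; treat the remainder as held-out.
    X_lab, X_rest, y_lab, y_rest = train_test_split(
        X, y, train_size=n_labeled, stratify=y, random_state=seed
    )
    # Smaller C means stronger regularization (placeholder value).
    model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=C))
    model.fit(X_lab, y_lab)
    acc = accuracy_score(y_rest, model.predict(X_rest))
    # AUC is computed from the signed distance to the separating hyperplane.
    auc = roc_auc_score(y_rest, model.decision_function(X_rest))
    return acc, auc


if __name__ == "__main__":
    acc, auc = evaluate_baseline()
    print(f"accuracy={acc:.3f}  AUC={auc:.3f}")
```

Scoring on the rows not used for fitting, rather than cross-validating inside the 250 labeled rows, is the design choice here: with so few labeled samples, the large held-out remainder gives a much less noisy estimate.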
Can anyone give me advice on what I'm doing wrong or what else to try?