{"id":5636,"date":"2025-07-28T07:02:23","date_gmt":"2025-07-28T07:02:23","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/07\/28\/why_onehotencoder_give_better_results_than\/"},"modified":"2025-07-28T07:02:23","modified_gmt":"2025-07-28T07:02:23","slug":"why_onehotencoder_give_better_results_than","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/07\/28\/why_onehotencoder_give_better_results_than\/","title":{"rendered":"Why does OneHotEncoder give better results than get_dummies\/reindex?"},"content":{"rendered":"<p>Why does OneHotEncoder give better results than get_dummies\/reindex?<\/p>\n<div>\n<!-- SC_OFF -->\n<div class=\"md\">\n<p><strong>I can&#8217;t figure out why I get a better score with OneHotEncoder:<\/strong><\/p>\n<p>preprocessor = ColumnTransformer(<\/p>\n<p>transformers=[<\/p>\n<p>('cat', categorical_transformer, categorical_cols)<\/p>\n<p>],<\/p>\n<p>remainder='passthrough' # &lt;-- this keeps the numerical columns<\/p>\n<p>)<\/p>\n<p>model_GBR = GradientBoostingRegressor(n_estimators=1100, loss='squared_error', subsample=0.35, learning_rate=0.05, random_state=1)<\/p>\n<p>GBR_Pipeline = Pipeline(steps=[('preprocessor', preprocessor), ('model', model_GBR)])<\/p>\n<p><strong>than with get_dummies\/reindex:<\/strong><\/p>\n<p>X_test = pd.get_dummies(d_test)<\/p>\n<p>X_test_aligned = X_test.reindex(columns=X_train.columns, fill_value=0)<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/Due-Duty961\"> \/u\/Due-Duty961 <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datascience\/comments\/1mawalf\/why_onehotencoder_give_better_results_than\/\">[link]<\/a><\/span>   <span><a 
href=\"https:\/\/www.reddit.com\/r\/datascience\/comments\/1mawalf\/why_onehotencoder_give_better_results_than\/\">[comments]<\/a><\/span>\n<\/div>\n<p>\/u\/Due-Duty961<br \/>\n<a href=\"https:\/\/www.reddit.com\/r\/datascience\/comments\/1mawalf\/why_onehotencoder_give_better_results_than\/\">Go to original source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Why does OneHotEncoder give better results than get_dummies\/reindex? I can&#8217;t figure out why I get a better score with OneHotEncoder: preprocessor = ColumnTransformer( transformers=[ ('cat', categorical_transformer, categorical_cols) ], remainder='passthrough' # &lt;-- this keeps the numerical columns ) model_GBR = GradientBoostingRegressor(n_estimators=1100, loss='squared_error', subsample=0.35, learning_rate=0.05, random_state=1) GBR_Pipeline = Pipeline(steps=[('preprocessor', preprocessor), ('model', model_GBR)]) than with get_dummies\/reindex: X_test 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,99],"tags":[3337,1141,3338],"class_list":["post-5636","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-datascience","tag-dummies","tag-get","tag-reindex"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/5636"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=5636"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/5636\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=5636"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=5636"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=5636"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}