Large scale data analysis using MLlib 
	Ahmed Hussein Ali, Maan Nawaf Abbod, Mohammed Khamees Khaleel, Mostafa Abdulghafoor Mohammed, Tole Sutikno 
	
			
		Abstract 
		
		Recent advances in the internet, social media, and internet of things (IoT) devices have significantly increased the amount of data generated in a variety of formats. This data must be converted into formats that data analysis techniques can handle easily. Applying machine learning algorithms to large and complex data sets is computationally expensive and demands substantial logical and physical resources. Machine learning is a sophisticated data analytics technology that has grown in importance because of the massive amount of data generated daily that needs to be examined. The Apache Spark machine learning library (MLlib) is one of the big data analysis platforms; it provides a variety of functions for machine learning tasks, ranging from classification to regression and dimension reduction. This research investigates Apache Spark MLlib 2.0 from a computational standpoint as an open source, autonomous, scalable, and distributed learning library. Several real-world machine learning experiments are carried out to evaluate the properties of the platform qualitatively and quantitatively. Some of the fundamental concepts and approaches for developing a scalable data model in a distributed environment are also discussed.
		
		 
	
			
		Keywords 
		
		big data; data analysis; machine learning; open source; parallel processing; spark MLlib
		
		 
	
				
			
	
	
							
		
		DOI: 
http://doi.org/10.12928/telkomnika.v19i5.21059 	
	 
				
		This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
	
TELKOMNIKA Telecommunication, Computing, Electronics and Control, ISSN: 1693-6930, e-ISSN: 2302-9293, Universitas Ahmad Dahlan, 4th Campus, +62 274 564604