Scala Basics for Python Developers




October 18, 2014

Python is a great language: its syntax, standard library and scientific computing stack (numpy, scipy, scikit-learn, matplotlib and many others) are just great. Whenever I have a problem at hand, it is my go-to language, thanks to its extensive library support and community. However, every programming language has its shortcomings, and Python is no exception. It is not well suited to concurrent programming and parallelism, it is slow (compared to JVM-based languages), and its support for functional programming is limited to the map function and the functools module. Also, with the rise of big data, both Hadoop and Spark favor JVM-based languages (Java and Scala) for large-scale data processing, which adds one more item to Python's long-standing data processing disadvantages. You could reach those interfaces through Jython, but Jython development is lagging, and because of version differences you may not get full-featured Python with Jython.

In this post, I will compare variables, functions and classes in both languages. Admittedly, these are very basic constructs, but my hope is that they make a good start for Python developers who want to learn more about Scala.

Scala

If you have not heard of Scala, it is a modern, JVM-based programming language with a succinct syntax. It supports both object-oriented and functional programming styles, along with many more advanced features that I will say a little about shortly.

Why

Mainly for big data, to be honest. Adoption of Scala by companies is also good, but since Scalding and Spark in particular use Scala, I thought I should give Scala a shot as well. I have been playing with and using Clojure for some time (I open-sourced a K Nearest Neighbor classifier in Clojure, if you are interested) and am enjoying it so far. I also found it very powerful to be able to combine a Java library with Clojure, as it is pretty seamless with Leiningen. What I like about this hybrid approach is that you get to use all of these mature libraries written in Java while you write Clojure (in this case, Scala) on top of them. I think using JVM under the hood and building on top of long-time

Advantages

Disadvantages

val xmlRepresentation =  <p><a href="http://bugra.github.io/">Bugra Akyildiz</a></p>

Even if this is not a disadvantage per se, it is weird. I would understand JSON, but XML, really, Scala? I thought you were superior to Java, but it turns out that you are Java in some aspects!

Installation on Mac OS X

Homebrew

If you do not have Homebrew on Mac OS X, first install the package manager with the following command:

ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"

Scala

Then, you could install Scala via Homebrew:

brew install scala

After installing Scala, you could type scala on the command line to start the interactive REPL.

Sbt

In the meantime, you will want to install sbt (the Scala build tool), which is similar to Maven and Ant in Java, if you are familiar with either of those.

brew install sbt

IDE Support

Scala has official support for Eclipse, and one can use IntelliJ IDEA for Scala through a plugin. I am using IntelliJ, so I installed the plugin (Preferences -> Plugins -> Scala). Installation is quite straightforward.

Variables in Scala

There are two ways to define variables in Scala. The first is immutable (a nice fit for the functional programming style) and the other is mutable (surprise!). An immutable variable, defined with val, cannot be reassigned after it is given a value, whereas a mutable variable, defined with var, can be. Note that val and var govern the variable binding, not the object it refers to. Unlike in Python, this choice is made per variable, so the same string or integer can be held by either a val or a var. In Python, mutability is fixed by the type: some built-in types are immutable (string and tuple) and others are mutable (list and dictionary).

// String
val firstName: String = "Bugra"
// We could leave out the type and let type inference take over
// (redefining a val like this only works in the REPL)
val firstName = "Bugra" // firstName is still a String
// Similarly, we could always drop the type in the declaration
// and expect it to be inferred from the value
// Integer
val firstPrimeNumber = 2
// Double 
val doubleNumber = 3.0
// Long
val longNumber = 5L
// Up to this point, if you remove the types and val, everything 
// works in Python similarly as well

// Characters are written in single quotes ('B'), as in Java
val firstChar = firstName(0)
// Symbols are interned strings, which makes comparing
// two of them cheap
// Python has no direct symbol equivalent
// (newer Scala versions deprecate this literal in favor of Symbol("ha"))
val symbol = 'ha
// null, same as null in Java, similar to None in Python
val non = null
// List (First get Range and then convert it into list)
val numbers = (1 to 100).toList
// This works too
val anotherNumbers = List(0, 1, 2, 3, 4, 5, 6, 7)
// Tuple (very similar to Python: immutable and can contain different types)
val tuple = (firstName, doubleNumber, non, numbers)
// Tuple elements are accessed with an interesting syntax,
// and indexing starts at 1
tuple._1 // => returns firstName
tuple._2 // => returns doubleNumber and so on
// XML(!), needless to say, Python does not have equivalent 
// of this
val xmlRepresentation =  <p><a href="http://bugra.github.io/">Bugra Akyildiz</a></p>
// Unit type, which is almost the same as void in Java
val unitType : Unit = ()
val unitType = () // same as above, type inference
// Dictionary-like Map, which could have different types in its keys and values
val blueWhale = Map("name" -> "Blue Whale", "weight" -> 170, "height" -> 30)
// To access a key, same as Python (with a tweak: get returns an Option)
blueWhale.get("name") // returns Some(Blue Whale)
blueWhale.getOrElse("color", "blue") // returns blue
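
For comparison, here is a minimal Python sketch of the same two lookups. The `blue_whale` dictionary mirrors the Scala `blueWhale` map, and the variable names are only illustrative. `dict.get` returns the value itself, or `None` for a missing key, where Scala's `get` wraps the result in an `Option`. Its optional second argument plays the role of `getOrElse`.

```python
# Python counterpart of the Scala Map lookups above
blue_whale = {"name": "Blue Whale", "weight": 170, "height": 30}

# Scala's blueWhale.get("name") returns Some("Blue Whale");
# Python returns the value directly
name = blue_whale.get("name")

# A missing key gives None in Python instead of an empty Option
color = blue_whale.get("color")

# The second argument is the fallback, like getOrElse("color", "blue")
color_or_default = blue_whale.get("color", "blue")

print(name, color, color_or_default)
```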

With type inference, Scala generally feels like a dynamic language when it comes to variables, even though it is a statically typed language.

// Immutable 
val ii = 1
// The following gives: 
//<console>:9: error: value += is not a member of Int
//              ii += 1
//                 ^
// Int has no += method; for a var the compiler rewrites ii += 1
// as ii = ii + 1, which is not allowed for a val
ii += 1 
// A val cannot be reassigned
// It gives the following error
// <console>:8: error: reassignment to val
//       ii = 3
//          ^
ii = 3 
// Instead use mutable with var
var ii = 0
ii += 1 // ii = 1

Variables in Python

In Python, unlike Scala, mutability depends on the type: some built-in data structures are immutable (tuple and string) and others are mutable (list and dictionary).

# string
first_name = "Bugra"
# integer
first_prime_number = 2
# float (Python's counterpart of Double)
double_number = 3.0
# long (Python 2 wrote 5L; Python 3 has a single arbitrary-precision int)
long_number = 5
"""
There is no character type in Python, this is a string as well.
Also note that Scala uses parentheses rather than square 
brackets to access elements of collections, strings, arrays 
"""
first_char = first_name[0]
"""
List (workhorse of Python), very useful, can contain 
different data structures 
"""
numbers = list(range(10))
# Tuple
tup = (first_char, first_prime_number, numbers)
# indexing tuples and lists is the same, with square brackets
print(tup[0], numbers[1])  # prints: B 1
# Dictionary, JSON-like hash-maps of Python
blue_whale = {
    'name': 'Blue Whale',
    'weight': 170,
    'height': 30,
}
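
The split between immutable and mutable built-in types can be checked directly. The following is a small sketch with illustrative values: the tuple rejects item assignment, the list and dictionary are changed in place, and the final line shows that any Python name can be rebound, so there is no built-in counterpart of Scala's val.

```python
# Immutable built-ins: a tuple rejects in-place changes
tup = (1, 2, 3)
try:
    tup[0] = 10
except TypeError as error:
    print("tuple:", error)

# Mutable built-ins: list and dict can be changed in place
numbers = [0, 1, 2]
numbers.append(3)
blue_whale = {"name": "Blue Whale"}
blue_whale["weight"] = 170

# Any name can be rebound to a new object, unlike a Scala val
first_name = "Bugra"
first_name = "Akyildiz"
```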

Functions

There are a lot of ways to define functions in Scala, whereas in Python there are two: named functions and anonymous functions. This “there are a lot of ways” theme recurs throughout Scala. We will see in the next section that there are a lot of ways to define classes as well.

Functions in Python

You could define a function with a name, or define an anonymous function using a lambda expression and then assign it to a name. Lambda expressions are limited to a single expression and cannot contain statements. So, in practice, anonymous functions are generally used ad hoc.

def adder(x, y):
    return x + y
adder(3, 4) # 7
adder = lambda x, y: x + y
adder(3, 4) # 7 again
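
As a brief sketch of what a lambda can express: its body must be a single expression, but its parameter list follows the same rules as def, so default values are allowed. The `pairs` data and variable names below are only illustrative.

```python
# A lambda body is a single expression, but its parameter list
# accepts default values just like def
adder = lambda x, y=1: x + y

increment_result = adder(3)   # uses the default y=1
sum_result = adder(3, 4)      # overrides the default

# Typical ad hoc use: a throwaway key function for sorting
pairs = [(1, "b"), (2, "a")]
sorted_by_letter = sorted(pairs, key=lambda pair: pair[1])
```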

Functions in Scala

The same adder can be defined in Scala in the following ways (other than the types, it is very similar to Python):

def adder1(x: Int, y: Int): Int = {
    return x + y
}
// We could leave out the return type and the return statement
// The last expression is returned automatically and the return type 
// is inferred by the compiler 
def adder2(x: Int, y: Int) = {
    x + y
}
// The same function could be defined using anonymous function
val adder3 = (x: Int, y: Int) => x + y
// A block of case clauses is itself a function literal, so we could use it here
val adder4: (Int, Int) => Int = {
    case (x, y) => x + y
}
// Curried Functions are interesting
def adderFactory(x: Int)(y: Int) : Int = x + y
// We could create our own adders using currying
val adder10 = adderFactory(10) _ 
adder10(5) // returns 15
// Closures could be created using this currying concept in functions

// If we only want the side effects of a function, we could do this
// It returns `Unit` and is also called a 'procedure'
// It does not accept any parameters either
def printer = {
    println("I am printer")
}
printer // prints "I am printer"
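
For Python readers, the curried adderFactory above has two close counterparts, sketched here under the assumption that a plain two-argument `adder` is the starting point. `functools.partial` from the standard library fixes the first argument, and a hand-written closure (the hypothetical `adder_factory`) does the same by returning an inner function.

```python
from functools import partial


def adder(x, y):
    return x + y


# partial fixes the first argument, like adderFactory(10) _ in Scala
adder10 = partial(adder, 10)


# A closure gives the same result: the inner function remembers x
def adder_factory(x):
    def add_to_x(y):
        return x + y
    return add_to_x


adder20 = adder_factory(20)

print(adder10(5), adder20(5))
```

Python has no built-in currying syntax, so both forms are explicit about which argument is fixed.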

Classes

Both Python and Scala support the object-oriented programming style. Unlike Java, though, they do not enforce it. In that respect, they are similar. But as with functions, there are many ways to create classes in Scala, and it also has quite a bit of syntactic sugar for constructs commonly used in classes.

Classes in Python

Python classes can be defined as in the following:


class Operation(object):
    """ Arithmetic Operations on two numbers
    """

    def __init__(self, x, y):
        """ Arguments:
                    x, y(number-like):
        """
        self._x = x
        self._y = y

    def add(self):
        """ Add two numbers
        """
        return self._x + self._y

    def sub(self):
        """ Subtract two numbers
        """
        return self._x - self._y

    def mul(self):
        """ Multiply two numbers
        """
        return self._x * self._y

    def div(self):
        """ Divide two numbers
        """
        result = None
        try:
            result = self._x / float(self._y)
        except ZeroDivisionError as e:
            print(e)
        return result

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        self._y = value

    @y.deleter
    def y(self):
        del self._y

operation = Operation(3,4)
operation.add() # 7
operation.sub() # -1
operation.mul() # 12
operation.div() # 0.75
operation.x = 5
operation.y = 10
operation.add() # 15

In Scala, we could take a shortcut for the properties, as in the following:

// Mutable, we want to change the values as in the following
// val operation = new Operation(3, 4)
// operation.x = 5 
class Operation(var x: Int, var y: Int) {
        
        def add() : Int = {
            this.x + this.y
        }
        
        def sub() : Int = {
            this.x - this.y
        }
        
        def mul() : Int = {
            this.x * this.y
        }

        def div() : Double = {
            val doubleX = this.x.toDouble
            val doubleY = this.y.toDouble
            doubleX / doubleY
        }
}

// object instantiation is very similar to Java
// compared to Python, there is an extra "new"
val operation = new Operation(3,4)
operation.add() // 7
operation.sub() // -1
operation.mul() // 12
operation.div() // 0.75
operation.x = 5
operation.y = 10
operation.add() // 15
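
The Python class can be shortened in a similar spirit. As a sketch, assuming Python 3.7 or later, the standard library's `dataclasses.dataclass` decorator generates `__init__` from the annotated fields, and plain attributes stand in for the hand-written properties. This is one possible way to mirror the Scala constructor parameters, not the only one.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    """Arithmetic operations on two numbers."""
    x: float
    y: float

    def add(self):
        return self.x + self.y

    def sub(self):
        return self.x - self.y

    def mul(self):
        return self.x * self.y

    def div(self):
        # True division, so Operation(3, 4).div() gives 0.75
        return self.x / self.y


operation = Operation(3, 4)
operation.x = 5
operation.y = 10
```

Unlike the earlier Python version, this `div` lets ZeroDivisionError propagate instead of printing it.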
