A very nice article explaining smoothsort!
http://www.keithschwarz.com/smoothsort/
Friday, December 12, 2014
Sunday, December 7, 2014
Cloud Design Patterns
http://msdn.microsoft.com/en-us/library/dn568099.aspx
"This book contains twenty-four design patterns and ten related guidance topics, this guide articulates the benefit of applying patterns by showing how each piece can fit into the big picture of cloud application architectures. It also discusses the benefits and considerations for each pattern. Most of the patterns have code samples or snippets that show how to implement the patterns using the features of Microsoft Azure. However the majority of topics described in this guide are equally relevant to all kinds of distributed systems, whether hosted on Azure or on other cloud platforms."
"This book contains twenty-four design patterns and ten related guidance topics, this guide articulates the benefit of applying patterns by showing how each piece can fit into the big picture of cloud application architectures. It also discusses the benefits and considerations for each pattern. Most of the patterns have code samples or snippets that show how to implement the patterns using the features of Microsoft Azure. However the majority of topics described in this guide are equally relevant to all kinds of distributed systems, whether hosted on Azure or on other cloud platforms."
Friday, November 14, 2014
Type System in Java
Unified type system. In a unified type system, all types, including primitive types, inherit from a single root type. For example, C# has a unified type system, as every type in C# inherits from the Object class. Java, by contrast, has several primitive types that are not objects. However, Java provides wrapper object types that exist together with the primitive types, so developers can use either the wrapper object types or the simpler non-object primitive types.
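As a quick illustration (my own sketch, not from any referenced material), the compiler converts between a primitive and its wrapper automatically, and a primitive must be boxed before it can be treated as an Object:

    public class Boxing {
        public static void main(String[] args) {
            int p = 42;
            Integer boxed = p;        // autoboxing: int -> Integer
            int unboxed = boxed + 1;  // auto-unboxing: Integer -> int
            Object o = p;             // the primitive is boxed before use as an Object
            System.out.println(o.getClass().getName()); // java.lang.Integer
            System.out.println(unboxed);                // 43
        }
    }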
Type erasure. (to be completed later)

Generic types. Generics have been available in Java since J2SE 5.0 was released in 2004. Generics allow for parameterized types and hence allow the compiler to enforce type safety. For example, an instance of the Vector class can be declared to be a vector of strings, written as Vector<String>. To avoid major changes to the Java runtime environment, generics are implemented using type erasure, which amounts to removing the additional type information and adding casts wherever required. For example, there is no way to distinguish between a List<String> and a List<Long> at runtime: since the JVM doesn't track the type arguments of a generic class in the bytecode, both of them are of type List at runtime.
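One way to observe erasure directly (a sketch of mine, not from the original post) is to compare the runtime classes of two differently parameterized lists:

    import java.util.ArrayList;
    import java.util.List;

    public class ErasureDemo {
        public static void main(String[] args) {
            List<String> strings = new ArrayList<String>();
            List<Long> longs = new ArrayList<Long>();
            // Both lists share the same runtime class: the type
            // arguments were erased by the compiler.
            System.out.println(strings.getClass() == longs.getClass()); // true
            System.out.println(strings.getClass().getName());           // java.util.ArrayList
        }
    }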
Arrays. Java's arrays are parameterized types written in a form different from the generic types. For example, the array type String[] is analogous to the vector type Vector<String>. On the other hand, Java arrays are not subject to type erasure like generic types are, and this leads to inconsistency in the handling of arrays and generic types. One consequence of this is that the following code will not compile:

    class Test<T> {
        public Vector<T> getVector() { return new Vector<T>(); } // ok
        public T[] getArray() { return new T[10]; }              // compile-time error
    }

Since arrays don't undergo type erasure, the method getArray needs the type information of T to create the array. However, as Java generics are implemented using type erasure, the parameterized type T does not exist at runtime. Hence, the compiler cannot assign a type to the array and raises a "generic array creation" error.
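A common workaround (my sketch; the helper names are illustrative, not from the post) is to pass the element class explicitly and create the array reflectively, so the Class token carries the type information that erasure removed:

    import java.lang.reflect.Array;
    import java.util.Vector;

    class Test<T> {
        private final Class<T> elementType;

        Test(Class<T> elementType) {
            this.elementType = elementType;
        }

        public Vector<T> getVector() {
            return new Vector<T>(); // fine: erased to a raw Vector at runtime
        }

        @SuppressWarnings("unchecked")
        public T[] getArray() {
            // The Class token supplies the runtime type of the elements.
            return (T[]) Array.newInstance(elementType, 10);
        }
    }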
Type parameters. (to be completed later)

Type inference. Java doesn't support type inference for variable assignments; it requires that programmers declare the types they intend a method or variable to use. However, it is not true that Java has no type inference at all. Indeed, recent versions of Java support type inference when using generics. In Java 7, one gets some additional inference when instantiating generics, like so:

    Map<String, String> foo = new HashMap<>();

Java is smart enough to fill in the empty angle brackets for us. Moreover, Java 8 brings type inference as part of lambda expressions. For example, consider:
    List<String> names = Arrays.asList("Tom", "Dick", "Harry");
    Collections.sort(names, (first, second) -> first.compareTo(second));

The Java compiler can infer from the signatures Collections#sort(List<T>, Comparator<? super T>) and Comparator#compare(T o1, T o2) that first and second should be Strings, allowing the programmer to omit the type declarations in the lambda expression.

The reason why Java doesn't allow type inference for non-generic types seems to be its design philosophy: programmers should write things explicitly, so that the compiler has the same understanding of the code as they do. Besides, Java was originally aimed at programmers coming from C++, Pascal, and other mainstream languages that did not have type inference, so it was probably left out on the principle of least surprise.
Sunday, November 9, 2014
Spark for Beginners
Set up a Spark environment on a local Ubuntu machine:
http://blog.prabeeshk.com/blog/2014/10/31/install-apache-spark-on-ubuntu-14-dot-04/
Write Spark apps locally in IntelliJ and run them on a remote cluster:
http://blog.csdn.net/Camu7s/article/details/45530295
Run Spark apps on Windows without installing Hadoop:
http://qnalist.com/questions/4994960/run-spark-unit-test-on-windows-7
Compile and install Hadoop on Windows:
http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows
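After working through these links, a first app usually looks like the classic word count. Here is a minimal sketch of mine against the Spark 1.x Java API (the input/output paths and the local[*] master are illustrative assumptions):

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class WordCount {
        public static void main(String[] args) {
            // Run locally with as many worker threads as logical cores
            SparkConf conf = new SparkConf().setAppName("WordCount").setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            JavaRDD<String> lines = sc.textFile(args[0]);
            JavaPairRDD<String, Integer> counts = lines
                    .flatMap(line -> Arrays.asList(line.split(" "))) // one record per word
                    .mapToPair(word -> new Tuple2<>(word, 1))        // (word, 1) pairs
                    .reduceByKey((a, b) -> a + b);                   // sum counts per word

            counts.saveAsTextFile(args[1]);
            sc.stop();
        }
    }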
Install Dato on CentOS 6.4
Dato needs Python 2.7, while CentOS uses Python 2.6. So first you have to install Python 2.7 as an alternative build of Python on your system, as well as the libraries needed to compile Python modules:
    sudo yum install -y readline-devel sqlite-devel bzip2-devel.i686 \
        openssl-devel.i686 gdbm-devel.i686 libdbi-devel.i686 ncurses-libs

    cd /tmp

    # install zlib manually because the default one for CentOS is too old
    wget http://zlib.net/zlib-1.2.8.tar.gz
    tar -zxvf zlib-1.2.8.tar.gz
    cd zlib-1.2.8
    ./configure
    make && sudo make install

    # install Python 2.7.6
    wget https://www.python.org/ftp/python/2.7.6/Python-2.7.6.tgz
    tar -zxvf Python-2.7.6.tgz
    cd Python-2.7.6
    ./configure --enable-shared --enable-unicode=ucs4   # IMPORTANT!
    make && sudo make altinstall

You also need to install setuptools, pip, and virtualenv before running Dato. The installation steps, however, are standard; see e.g. here for details.
Saturday, October 11, 2014
Combiners in Hadoop MapReduce
A combiner function in MapReduce has the same form as the reduce function (and is an implementation of the Reducer interface), except its output types are the intermediate key and value types (K2 and V2), so they can feed the reduce function:
combiner: (K2, list(V2)) → list(K2, V2)
Often the combiner and reduce functions are the same, in which case K3 is the same as K2, and V3 is the same as V2. On the other hand, Hadoop reserves the right to use combiners at its discretion: a combiner may be invoked zero, one, or multiple times. Hence, the correctness of a MapReduce algorithm must not depend on computations performed by the combiner, or even on whether the combiner runs at all.
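As a concrete illustration (my own sketch against the Hadoop 2.x API, not from the original text), the word-count reducer can double as a combiner because integer addition is commutative and associative:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Sums the counts for a word; usable as both reducer and combiner
    // because its input and output types coincide (Text, IntWritable).
    public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // In the job driver:
    // job.setCombinerClass(IntSumReducer.class); // Hadoop may run it zero or more times

By contrast, a reducer that computes a mean cannot be reused as a combiner, since a mean of partial means is not, in general, the overall mean.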
Sunday, October 5, 2014
Guidance for Scientific Writing
The Structure, Format, Content, and Style of a Journal-Style Scientific Paper
Typesetting mathematics for science and technology according to ISO 31/XI
According to the standard, constants like the imaginary unit i and Euler's number e, as well as the differential operator d, should be set upright. However, these typesetting rules seem to be ignored by many respected authors and publishers. See this thread for some interesting discussion of the issue: What's the proper way to typeset a differential operator? For an easy solution, one of the replies there suggests typesetting a math paper in LaTeX using the package commath.
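For instance, a minimal LaTeX sketch (my own; as far as I recall, commath provides \dif for the upright differential and \od for ordinary derivatives, so double-check its documentation):

    \documentclass{article}
    \usepackage{amsmath}
    \usepackage{commath} % provides \dif (upright d) and \od{f}{x} for derivatives

    % Upright constants per ISO 31/XI, defined here by hand.
    \newcommand{\eu}{\mathrm{e}}
    \newcommand{\iu}{\mathrm{i}}

    \begin{document}
    \[ \int_0^1 f(x) \dif x, \qquad \od{f}{x}, \qquad \eu^{\iu\pi} + 1 = 0 \]
    \end{document}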