One profiler for Java that I've tried out the last time is the profiler that is part of Visual VM which is a nice tool shipped with the standard JDK since 1.6. Visual VM can be used for a multitude of other things than profiling and if you haven't yet tried it you really should. It can be connected to both local and remote JVMs using jstat and JMX. I'm not going to dive into the details using Visual VM in general there are great resources for that out there.
What I want to show here is an example of a small profiling session. For this simple example a prime generator that writes it's results to disk will be the subject. I'll be using some Spring configuration and dependency injection since I wanted to try it out. That is completely irrelevant to the profiling though.
Step 1, a naive implementation
Below is the complete program including the spring configuration.// PrimeMain.java package primegenerator; import org.springframework.context.support.ClassPathXmlApplicationContext; public class PrimeMain { public static void main(String[] args) { ClassPathXmlApplicationContext beanFactory = new ClassPathXmlApplicationContext( "PrimeGeneratorApplication.xml"); PrimeGenerator generator = (PrimeGenerator) beanFactory.getBean("generator"); generator.generatePrimes(2000000); } }
// FileDestination.java package primegenerator; import java.io.FileOutputStream; import java.io.IOException; import java.io.OutputStreamWriter; import java.io.Writer; public class FileDestination implements IPrimeDestination { private final String fileName; public FileDestination(String fileName) { this.fileName = fileName; } @Override public void put(long prime) { Writer out = null; try { out = new OutputStreamWriter(new FileOutputStream(fileName, true)); out.write(Long.toString(prime) + "\n"); } catch (IOException e) { e.printStackTrace(); } finally { try { if(out != null) { out.close(); } } catch (IOException e) { e.printStackTrace(); } } } }
// IPrimeDestination.java package primegenerator; public interface IPrimeDestination { public void put(long prime); }
// PrimeGenerator.java package primegenerator; import org.springframework.beans.factory.annotation.Autowired; public class PrimeGenerator { @Autowired private IPrimeDestination destination; private boolean isPrime(long number) { for(long i = 2; i<Math.sqrt(number); i++) { if(number % i == 0) { return false; } } return true; } public void generatePrimes(long upTo) { long startTime = System.currentTimeMillis(); for(long i = 3; i < upTo; i++) { if(isPrime(i)) { destination.put(i); } } System.out.println("Done generating primes, time = " + (System.currentTimeMillis() - startTime) + " ms"); } }
// PrimeMain.java package primegenerator; import org.springframework.context.support.ClassPathXmlApplicationContext; public class PrimeMain { public static void main(String[] args) { ClassPathXmlApplicationContext beanFactory = new ClassPathXmlApplicationContext( "PrimeGeneratorApplication.xml"); PrimeGenerator generator = (PrimeGenerator) beanFactory.getBean("generator"); generator.generatePrimes(2000000); } }
// ..\PrimeGeneratorApplication.xml <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:aop="http://www.springframework.org/schema/aop" xmlns:context="http://www.springframework.org/schema/context" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd"> <context:component-scan base-package="primegenerator" /> <bean id="generator" class="primegenerator.PrimeGenerator" /> <bean id="destination" class="primegenerator.FileDestination"> <constructor-arg value="primes.txt" /> </bean> </beans>Generating all primes under 200000 takes about 12500 milliseconds with above application. To profile the application run kit with an upper limit high enough to allow for some profiling. In the left window your java process will show up and you can choose to connect to it. Then, using the sampler, and selecting CPU will result in a screen like that below.
To get the bast results from our optimizations we should hence focus on improving the file handling in the first stage.
Step 2, improved file handling
Let's change the file handling to only open the file when the program starts and close it when the program finishes. To do that we will update the FileDestination class as well as the spring configuration file and the main class to allow Spring to manage the life cycle of FileDestination so that the file is opened before the first write and closed when the program exits.
Updates as follows.
// FileDestination.java package primegenerator; import java.io.FileNotFoundException; import java.io.FileOutputStream; import java.io.IOException; import java.io.OutputStreamWriter; import java.io.Writer; public class FileDestination implements IPrimeDestination { private final String fileName; private Writer out; public FileDestination(String fileName) { this.fileName = fileName; } public void initBean() { out = null; try { out = new OutputStreamWriter(new FileOutputStream(fileName, true)); } catch (FileNotFoundException e) { e.printStackTrace(); } System.out.println("Init called"); } public void destroyBean() { try { if(out != null) { out.close(); } } catch (IOException e) { e.printStackTrace(); } System.out.println("Destroy called"); } @Override public void put(long prime) { try { out.write(Long.toString(prime) + "\n"); } catch (IOException e) { e.printStackTrace(); } } }
// PrimeMain.java package primegenerator; import org.springframework.context.support.ClassPathXmlApplicationContext; public class PrimeMain { public static void main(String[] args) { ClassPathXmlApplicationContext beanFactory = new ClassPathXmlApplicationContext( "PrimeGeneratorApplication.xml"); beanFactory.registerShutdownHook(); PrimeGenerator generator = (PrimeGenerator) beanFactory.getBean("generator"); generator.generatePrimes(200000); } }
// ..\PrimeGeneratorApplication.xml <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:aop="http://www.springframework.org/schema/aop" xmlns:context="http://www.springframework.org/schema/context" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd" default-init-method="initBean" default-destroy-method="destroyBean"> <context:component-scan base-package="primegenerator" /> <bean id="generator" class="primegenerator.PrimeGenerator" /> <bean id="destination" class="primegenerator.FileDestination"> <constructor-arg value="primes.txt" /> </bean> </beans>
Through the default-init-method and default-destroy-method in the application context we have now defined a method to be called on all beans when they have been created and initialized and one method to be called when the application context is destroyed. To make the application context aware of when the program is ended PrimeMain has been updated to register the application context with the JVM by calling the registerShutdownHook method.
With this update in place the time taken for all primes under 20000 is now down to 1870 ms. A dramatic speedup indeed! And somewhat surprising maybe since 1870 ms is far less than 30% (which was the time taken by the actual prime generation) of the time in Step 1. We may have reaped additional benefits from out improved I/O that were not obvious before. Examples of such may be improved caching, less context switching for I/O, etc.
With this update in place the time taken for all primes under 20000 is now down to 1870 ms. A dramatic speedup indeed! And somewhat surprising maybe since 1870 ms is far less than 30% (which was the time taken by the actual prime generation) of the time in Step 1. We may have reaped additional benefits from out improved I/O that were not obvious before. Examples of such may be improved caching, less context switching for I/O, etc.
Profiling the new application shows the following.
Indeed, the file handling is hardly noticeable now. If we want to improve the performance of this application further we should focus on the algorithm.
Indeed, the file handling is hardly noticeable now. If we want to improve the performance of this application further we should focus on the algorithm.
Step 3, improved algorithm
Lets switch that inefficient algorithm to a better one. This is a slightly tweaked version of the Sieve of Eratosthenes.
// PrimeGenerator.java package primegenerator; import org.springframework.beans.factory.annotation.Autowired; public class PrimeGenerator { @Autowired private IPrimeDestination destination; public void generatePrimes(int upTo) { long startTime = System.currentTimeMillis(); int N = upTo; // assume 2 and all odd integers >= 3 are prime boolean[] isPrime = new boolean[N + 1]; isPrime[2] = true; for (int i = 3; i <= N; i+=2) { isPrime[i] = true; } // mark non-primes <= N using Sieve of Eratosthenes for (int i = 2; i*i <= N; i++) { // if i is prime, then mark multiples of i as nonprime // suffices to consider multiples i, i+1, ..., N/i if (isPrime[i]) { for (int j = i; i*j <= N; j++) { isPrime[i*j] = false; } } } // Output primes for (int i = 2; i <= N; i++) { if (isPrime[i]) destination.put(i); } System.out.println("Done generating primes, time = " + (System.currentTimeMillis() - startTime) + " ms"); } }
Now the runtime for all primes under 200000 is down to 120 ms! That's less than 1% of the original runtime.
The profiler shows us the following.
Now I/O is once again the dominating factor. If we want to further improve the performance we should therefor focus on improving that.
I will settle with the current improvements and hope that this was good illustration of how to work with optimizations. Measure -> Tweak -> Measure -> Tweak -> ... Until good enough. Don't be tempted to perform optimizations on code that have not been proven to be a bottle neck. Code optimizations will often result in code that is less clear and harder to maintain.
Inga kommentarer:
Skicka en kommentar