RUN LINUX! May released

May’s issue of RUN LINUX! should be arriving in subscribers’ post any day now. This month you’ll see how to use GParted to manage your system’s partitions with confidence. Ex-Windows users often treat partitions with a bit of awe, but you’ll see that with a tool like GParted it’s easy to reorganise your partitions the way you want them, whether by creating, deleting or even resizing them. It can be amazing how much disk space is sitting around unused because it’s been allocated to a partition for a long-dead system.

Desktop Publishing (DTP) software can be extremely expensive. In this month’s RUN LINUX! you’ll discover the Scribus open-source DTP that has been specifically designed for use on Linux systems. Despite being free, this system offers most of the features of the big commercial solutions. In fact, it’s so good that some newspapers in the US use it as their sole layout application.

We’re often asked which media player we use on Linux. The answer may surprise you: MPlayer has long been our weapon of choice, and once you get the hang of it you’ll see why. In RUN LINUX! this month, you’ll see how to get started with this deceptively powerful application. The fine and precise navigation controls are a big plus, and for those who enjoy foreign films, the ability to adapt the speed of the subtitles to the frame rate of the film is a winning feature.

As usual, a number of other tips and tricks are included and some of the more interesting questions sent in by our readers are answered. One thing we’d certainly recommend checking out is Linux’s shredding facility, which will prevent others recovering your deleted docs.

RUN LINUX! is written by experts from Siniatech and is published by Agora Business Publications. To find out more, or subscribe, visit the RUN LINUX! website here.

Fed up of filter(_.isInstanceOf).map(_.asInstanceOf) in Scala?

I can’t tell you how many times I’ve written code of the form filter(_.isInstanceOf).map(_.asInstanceOf) to extract items with a common type from a mixed list. However, I also can’t tell you how many times I have found a better way and then forgotten it! In the hope that someone else might find it useful, and also as an aide-mémoire for us here at Siniatech, I decided to put it here for quick reference.

The simplest solution I have found is to use a collect with a single case statement which filters on a given type and implicitly produces a list of that type. This produces a statement that is less than half the size of the original, and if anything is clearer in its intent. This can only be a good thing! So:

mixed.filter(_.isInstanceOf[Cat]).map(_.asInstanceOf[Cat])

Becomes:

mixed.collect{ case t : Cat => t }

Which to me at least is a lot more readable. In case my explanation is a bit terse, I’ve included an example, with a test case below:

trait Pet {}
 
class Dog extends Pet {}
 
class Cat extends Pet {}
 
class TestCollect {
  import org.junit.Test
  import org.junit.Assert._
 
  @Test
  def shouldFilterList(): Unit = {
    val fido = new Dog
    val rex = new Dog
    val tom = new Cat
    val sylvester = new Cat
    val tiddles = new Cat
    val mixed = List(fido, tom, 3, rex, "hello", sylvester, tiddles)
    assertEquals(List(fido, rex), mixed.collect{ case t: Dog => t })
    assertEquals(List(tom, sylvester, tiddles), mixed.collect{ case t: Cat => t })
  }
}
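
If the same pattern crops up for many different types, the collect can be wrapped in a small generic helper. The following is only a sketch, assuming Scala 2.10 or later (where ClassTag is available); TypeFilter and ofType are names invented here for illustration and are not part of the standard library.

```scala
import scala.reflect.ClassTag

object TypeFilter {
  // Keeps only the elements whose runtime class matches T.
  // The ClassTag lets the pattern match check T at runtime despite erasure;
  // only the outer class is checked, so type parameters of T are not verified.
  def ofType[T: ClassTag](items: Iterable[Any]): List[T] =
    items.collect { case t: T => t }.toList
}
```

With the classes from the example above, TypeFilter.ofType[Cat](mixed) would be expected to give the same list as mixed.collect{ case t: Cat => t }.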

Generating a checksum from an unordered collection

Recently, I was trying to work out how I could generate a single checksum for a collection of data. We only wanted something to quickly determine that large sets of data were equivalent, and weren’t looking to spend a lot of time creating an intricate domain-specific scheme. When Googling around the issue, however, it seemed that most of the proffered solutions were either too complicated – given the time we wanted to spend – or assumed that the data was ordered in some way. As I was going to be dealing with large sets of data, I didn’t want to pay the cost of sorting my unordered data (in terms of either time or memory).

Fortunately, I happened to chat to a friend about the problem, and he reminded me about the commutativity (and associativity) of XOR. Using this operation, I would be able to produce a SHA hash for each data item and then combine the results to produce a checksum that was relatively collision-free, but reproducible regardless of order.

Given that the digest methods from the Apache Commons Codec project can produce a byte array, the implementation was pretty easy:

object Checksummer extends (String => Array[Byte]) {
  import org.apache.commons.codec.digest.DigestUtils._
 
  def apply(s: String) = sha(s)
 
  implicit def toHexString(bytes: Array[Byte]): String =
    bytes.map(b => Integer.toString((b & 0xff) + 0x100, 16).substring(1)).mkString
}
 
object ChecksumReducer extends ((Array[Byte], Array[Byte]) => Array[Byte]) {
  def apply(c1: Array[Byte], c2: Array[Byte]) =
    c1.zip(c2).map(p => (p._1 ^ p._2).asInstanceOf[Byte])
}
 
class TestChecksummer {
  import Checksummer._
  import org.junit.Test
  import org.junit.Assert._
 
  @Test
  def shouldProduceSameChecksumRegardlessOfOrder(): Unit = {
    val cs1: String = Set("a","b","c").map(Checksummer).reduceLeft(ChecksumReducer)
    val cs2: String = List("c","b","a").map(Checksummer).reduceLeft(ChecksumReducer)
    val cs3: String = List("b","a","c").map(Checksummer).reduceLeft(ChecksumReducer)
    assertEquals(cs1, cs2)
    assertEquals(cs2, cs3)
  }
}

I should imagine it’s reasonably obvious what’s going on, but to clarify, the Checksummer is a function that produces a SHA-1 hash from a string. Producing the string reliably can be an additional issue, but that’s another story. The ChecksumReducer function can then be used to reduce a collection of checksums into a single summary checksum, and the implicit method toHexString() can be used to give a human-readable value.

A simple test is given to show the functions in action. I hope someone else might find this useful, but I’m putting it here mostly to remind myself in case we need to do something like this again.
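
For anyone without Commons Codec to hand, the same idea can be sketched with nothing but the JDK. This is an illustrative variant rather than the code we used: it assumes SHA-1 over the UTF-8 bytes of each string, and XorChecksum and its methods are names made up for this sketch.

```scala
import java.security.MessageDigest

object XorChecksum {
  // SHA-1 digest (20 bytes) of the UTF-8 encoding of s.
  def sha1(s: String): Array[Byte] =
    MessageDigest.getInstance("SHA-1").digest(s.getBytes("UTF-8"))

  // Byte-wise XOR of two equal-length digests.
  def xor(a: Array[Byte], b: Array[Byte]): Array[Byte] =
    a.zip(b).map { case (x, y) => (x ^ y).toByte }

  // XOR of all item digests, starting from 20 zero bytes (the identity for XOR).
  def checksum(items: Iterable[String]): Array[Byte] =
    items.iterator.map(sha1).foldLeft(new Array[Byte](20))(xor)

  // Lower-case hex rendering of a byte array.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => "%02x".format(b & 0xff)).mkString
}
```

One property worth keeping in mind: because x XOR x is zero, an item that appears an even number of times cancels itself out, so XorChecksum.checksum(List("a", "a")) is all zeros. That is fine for sets of distinct items, but collections that may contain duplicates probably need something extra, such as folding a count into each item’s string.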

Keeping track of when records have been updated in Oracle

Recently we have been creating some processes for extracting data from an Oracle database. As we may be dealing with a lot of data, we were looking to reduce the quantity of data we were extracting each time. The best way of doing this seemed, fairly obviously, to only extract the data that’s actually changed since we last did an extract. Doing this required keeping track of when the data is modified.

We looked at a number of solutions, most of which involved adding a ‘last_updated’ column to each table. This column would then be populated through a variety of mechanisms: from simply making sure to update it every time you change the data, to adding some triggers.

While these solutions would be okay when there is a genuine domain-driven need to store this information, it seems poor design to do it this way – and include this clutter within your data – when you only need it for technical reasons. Fortunately, a little bit of exploration uncovered an Oracle feature that we had not seen before.

Every Oracle table has a number of pseudo-columns; one of these, ORA_ROWSCN, stores the system change number (SCN) of the most recent change to the row. The SCN can be converted to a timestamp and so can be queried to give an effective ‘last updated’ time for the row. Now, this timestamp is not entirely accurate, and you wouldn’t want to use it for anything mission-critical – it’s the transaction’s commit time that is recorded rather than the specific row update time, and unless the table was created with ROWDEPENDENCIES the SCN is tracked per data block rather than per row, so some unchanged rows may appear to have changed – but this is more than adequate for an application like ours. The data is consistent, if not precise, and we are able to use it to identify those records that have changed since the last extract.

To get this working we define a view on each table we’re extracting from:

CREATE OR REPLACE VIEW TAB_TO_EXTRACT_VW AS
SELECT
  id,
  -- an expression column in a view needs an alias; the name is our choice
  SCN_TO_TIMESTAMP(ORA_ROWSCN) AS last_updated
FROM
  TAB_TO_EXTRACT;

Then at extract time, we identify those ids that correspond to changed records (by querying the view) and then extract the rows with those ids.
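
The first of those two steps might be driven from Scala over plain JDBC along the following lines. This is only a sketch under several assumptions: that the view exposes its timestamp under a column called last_updated, that id is numeric, and that the caller supplies an open connection. ChangedIds, Sql and since are names invented for the illustration.

```scala
import java.sql.{Connection, Timestamp}
import scala.collection.mutable.ListBuffer

object ChangedIds {
  // Assumes the view's timestamp column is named last_updated (hypothetical).
  val Sql = "SELECT id FROM TAB_TO_EXTRACT_VW WHERE last_updated > ?"

  // Returns the ids of rows the view reports as changed after lastExtract.
  def since(conn: Connection, lastExtract: Timestamp): List[Long] = {
    val stmt = conn.prepareStatement(Sql)
    try {
      stmt.setTimestamp(1, lastExtract)
      val rs = stmt.executeQuery()
      val ids = ListBuffer.empty[Long]
      while (rs.next()) ids += rs.getLong("id")
      ids.toList
    } finally stmt.close() // closing the statement also closes its result set
  }
}
```

The returned ids can then be used to select the full rows from TAB_TO_EXTRACT, and the time of this run recorded as the starting point for the next extract.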

RUN LINUX! April released

April’s issue of RUN LINUX! should be arriving in subscribers’ post any day now. This month you’ll see how to get set up with Ubuntu One. As well as showing how to store up to 5GB of data in the cloud for free from your Ubuntu system, April’s issue will show you how to get the system set up on your Windows system and Android phone, keeping all of your important data synchronised between the machines that you use. The web interface is also introduced, showing how Ubuntu One allows you to access your files from pretty much anywhere.

With the recent financial crisis, money is short for everyone. GnuCash is an awesome open-source application that enables you to import and analyse your financial records. Most Internet banking sites support some form of data export now, and we bet that after putting a few months’ data into GnuCash you’ll be surprised at the results, and you’ll be able to identify a few places where you can make savings; RUN LINUX! shows you how.

OpenShot is a video editor capable of adding all sorts of effects to your videos, and in this month’s issue you’ll see how to add opening titles and music to a home video. Additionally, we’ll look at handling special characters, identifying your Linux version and correcting your screen resolution. Elsewhere, there are a few tips on how you can improve your interaction with the computer by configuring some of the accessibility options to suit you.

RUN LINUX! is written by experts from Siniatech and is published by Agora Business Publications. To find out more, or subscribe, visit the RUN LINUX! website here.

Using VisualVM to monitor a remote JBoss instance

Why does everything in software development turn into a bit of a faff? I was looking at doing a relatively simple task – monitoring the memory usage of a JBoss server – but I couldn’t just start up VisualVM and point it to my JBoss instance; there were all sorts of hoops I needed to go through. As it took me a little while to figure out, I thought I would post the process I followed here; partly to help anyone else in the same boat, but also to remind me in case I need to do the same thing again!

The first stumbling block I ran into was that, to monitor remotely, it was necessary to run Java’s jstatd on the server so that VisualVM could collect the data it needed. So I dutifully tried to start the daemon, but got the following error:

Could not create remote object
access denied (java.util.PropertyPermission java.rmi.server.ignoreSubClasses write)
java.security.AccessControlException: access denied (java.util.PropertyPermission java.rmi.server.ignoreSubClasses write)
        at java.security.AccessControlContext.checkPermission(AccessControlContext.java:323)
        at java.security.AccessController.checkPermission(AccessController.java:546)
        at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
        at java.lang.System.setProperty(System.java:725)
        at sun.tools.jstatd.Jstatd.main(Jstatd.java:122)

After a bit of googling, I discovered that it was necessary to use a particular security policy to allow the daemon the required access. I created a file called ‘visualvm.policy’ and added the following:

grant codebase "file:${java.home}/../lib/tools.jar" {
     permission java.security.AllPermission;
};

I tried again to launch the daemon, specifying the security policy to use from the command line, but now I ran into another error:

Could not bind /JStatRemoteHost to RMI Registry
java.rmi.ConnectIOException: non-JRMP server at remote endpoint
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:230)
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
        at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:322)
        at sun.rmi.registry.RegistryImpl_Stub.rebind(Unknown Source)
        at java.rmi.Naming.rebind(Naming.java:160)
        at sun.tools.jstatd.Jstatd.bind(Jstatd.java:40)
        at sun.tools.jstatd.Jstatd.main(Jstatd.java:126)

This time it looked like there were issues with the port being bound to. I tried a few things suggested on the ‘net, but after having no luck, realised I could work this out from the JBoss logs. Specifically the line:

2012-03-11 19:00:51,638 INFO  [org.jboss.web.WebService] (main) 
    Using RMI server codebase: http://XX.XX.XX.XX:8083/

Using this I was finally able to get the daemon started with the following command:

jstatd -p 8083 -J-Djava.security.policy=visualvm.policy

So now I tried to connect from my client. Still no luck!

Then I spotted a blog entry from Gunnar Hillert and the information contained within helped me enormously. It turns out that if you’re using raw IPs it’s necessary to add some additional Java options in your configuration file (run.conf). Gunnar notes that the last is of particular significance:

JAVA_OPTS="$JAVA_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=6789"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JAVA_OPTS="$JAVA_OPTS -Djava.rmi.server.hostname=XX.XX.XX.XX"

Having made this change, I was now able to connect from VisualVM and monitor the heap. This involved:

  1. Right-clicking on Remote in the applications pane, selecting Add Remote Host from the context menu and adding the necessary details. After a few seconds the server appeared in the list of remote applications.
  2. Right-clicking on the new server, selecting Add JMX Connection and entering the connection details (the port number to use is derived from the properties above – in this case 6789). After a few seconds an entry appeared beneath the server – double-clicking this began the monitoring.

Note: I was trying to monitor memory usage during the server initialisation. However, I found that if I started the jstatd too early, the server would blow up. After a bit of thought I realised that I needed to wait for the ‘Using RMI server codebase’ log message – that I referred to above – to appear before starting the daemon and then connecting from VisualVM.
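
As a quick sanity check that the JMX port configured above is reachable before turning to VisualVM, something along these lines could be run from the client machine. It is a rough sketch, not part of the process I followed: it assumes the usual JMX-over-RMI URL form and the unauthenticated, non-SSL settings shown earlier, and RemoteHeapCheck and its methods are names invented here.

```scala
import java.lang.management.{ManagementFactory, MemoryMXBean}
import javax.management.MBeanServerConnection
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

object RemoteHeapCheck {
  // Standard JMX-over-RMI service URL; host and port are whatever you configured.
  def jmxUrl(host: String, port: Int): String =
    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi"

  // Reads current heap usage (in bytes) through any MBean server connection.
  def heapUsedBytes(conn: MBeanServerConnection): Long = {
    val memory = ManagementFactory.newPlatformMXBeanProxy(
      conn, ManagementFactory.MEMORY_MXBEAN_NAME, classOf[MemoryMXBean])
    memory.getHeapMemoryUsage.getUsed
  }

  // Connects to the remote JVM, prints its heap usage and closes the connection.
  def printRemoteHeap(host: String, port: Int): Unit = {
    val connector = JMXConnectorFactory.connect(new JMXServiceURL(jmxUrl(host, port)))
    try println("Heap used: " + heapUsedBytes(connector.getMBeanServerConnection) + " bytes")
    finally connector.close()
  }
}
```

Calling RemoteHeapCheck.printRemoteHeap with the server’s address and port 6789 should print a figure if the options above have taken effect; a connection error at this point suggests the problem lies with the JMX settings rather than with VisualVM.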

Troubles with Code Coverage in Java 7

Here at Siniatech we’ve been looking at kicking off a new project that is mostly going to be dealing with a whole load of file management. While we’ve been involved in a lot of Scala development recently, we thought we’d give Java 7 a bit of a spin as its NIO 2.0 features seemed to provide a lot of new functionality that was just going to make our life a whole lot easier (let’s face it, scala.io is not the best bit of Scala).

Additionally, we thought we’d use this opportunity to try a code coverage tool from the ground up. I’ve always been in two minds about such tools and the value they provide. I can see that they can ensure that you’ve tested everything, but not necessarily that you’ve tested it well. I’ve generally felt that a good developer will write good tests that cover the important stuff with or without a coverage tool, but a poor developer will simply game the system by writing tests that ensure every line of code is executed rather than that the system actually behaves how it’s supposed to.

Surely the development of tests is partly there to help a developer spot their misunderstandings, and I’m worried that coverage tools simply encourage you to compound those. When I write a test I do so from a completely functional point of view, i.e. I write a test to cover use cases of the class (and no, I’m not talking about integration tests – just behavioural tests). When I’ve seen people use coverage tools before, I have found that they simply write one or more tests for each method to cover the various branches without really thinking about what the class actually ‘does’.

Despite my reservations, we have obviously used coverage tools before, but mostly these have been brought in part-way through a project, so I decided it was worth giving coverage a go on a new project, and using it properly from the start.

Anyway, so once we’d decided to go ahead with Java 7 and code coverage, we got to work cutting some code and created some Maven projects. We plumbed in Cobertura and all seemed to go swimmingly at first, with the plugin able to generate some coverage statistics for some relatively simple classes and tests. However, as the number and size of the tests grew, we started to hit some issues; all the tests would pass during a normal run, but as soon as Cobertura added its instrumentation we were seeing stack frame errors everywhere. We pushed on trying various configurations assuming we were having some finger-trouble, but before long we turned to Google and discovered that Cobertura wasn’t actually compatible with Java 7. This seemed a bit odd to us given that Java 7 has been out for a while now, but a bit more research seemed to show that actually there were no code coverage tools available for Java 7 unless we were prepared to pay for Atlassian’s Clover. A bit perturbed with the whole thing we ended up giving up on it for a while, especially as we were busy with other projects.

Obviously, I was a bit dumbfounded that there could be no way to get code coverage working with Java 7, so I continued Googling when I got the chance and had a closer look at the various coverage tools available. Eventually I came across JaCoCo. At first I didn’t hold out much hope, especially as it’s part of the EclEmma suite which I’d already tried without success. However, I observed that the latest release was from January 2012 and there were some hints on the fairly sparse website that it was compatible. That website didn’t contain a great deal of information on how to plumb it into Maven either, but I managed to cobble together something that appeared to work. However, whenever I tried to generate the report, an IOException was thrown complaining about a missing file!

Back to Google once more, and having trawled numerous websites (mostly aimed at getting JaCoCo working under Sonar, which wasn’t what I wanted), I was able to piece together the following XML. I now have a working coverage report, but given the length of time it took me to get there I thought it was worth posting so that it might help others in need. I’ll report back later on whether we find using this tool with our new project to be a help or a hindrance.

Note that Eclipse reports an error on line 10, but despite this the pom is able to install and generate the reports successfully – even under Eclipse. If I get a chance I’ll investigate and update.

<build>
  <plugins>
    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <version>0.5.6.201201232323</version>
      <executions>
        <execution>
          <goals>
            <goal>prepare-agent</goal>
          </goals>
          <configuration>
            <propertyName>coverageAgent</propertyName>
          </configuration>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <argLine>-Xmx256m ${coverageAgent}</argLine>
      </configuration>
    </plugin>
  </plugins>
</build>

RUN LINUX! March released

March’s issue of RUN LINUX! should be arriving in subscribers’ post any day now. This month there is a special issue included that deals with getting your Ubuntu machine a permanent presence on the Internet.

The issues of dynamic IPs are discussed, and Dynamic DNS services are presented along with ddclient for automatic IP updates in Linux. Once a machine is accessible, the next thing to do is to get some servers installed, and here the installation and configuration of OpenSSH and vsftpd are run through in detail. Guidance on how to access these servers from the command line is given, and tools like PuTTY and FileZilla are also introduced.

Also in the special issue, we look at the use of bandwidth tracking tools that can be used to ensure that excessive use of the new server is not going to lead to a costly bill from your ISP at the end of the month.

In the regular issue, there is also an exciting range of topics covered: from protecting your passwords to tracking down your ancestors.

Password protection has recently become a hot topic following the exposure of vulnerabilities with practically every browser’s password management capabilities. KeePassX is looked at, and the best ways of protecting your passwords are discussed.

Getting webcams working on Linux has never been that easy, so we look at common problems in this area and do a bit of troubleshooting. We also show how to take advantage of your webcam and look at avoiding costly phone bills by using Skype to contact friends and family.

Keeping with the theme of family connections, making contact with family long since gone has seen a bit of a resurgence lately, with TV programs like ‘Who Do You Think You Are’ making researching the family tree a popular hobby. We look at the open-source Gramps genealogy tool that is now considered to be one of the best ways to keep track of all your research.

Elsewhere, the regular issue also covers such topics as: getting started with LibreOffice Base, using Unity 2D, recording video messages and using the BURG boot loader.

RUN LINUX! is written by experts from Siniatech and is published by Agora Business Publications. To find out more, or subscribe, visit the RUN LINUX! website here.

 

RUN LINUX! February released

February’s issue of RUN LINUX! should be arriving in subscribers’ post any day now. The quest for ever more powerful computers leads to ever more powerful processors, but with power comes heat. This issue of RUN LINUX! shows how to hook up monitoring tools to the hardware sensors present in most modern PC components. Excessive heat can cause severe damage to your computer so monitoring your system is the first step in preventing your computer suffering; if a problem is discovered, there are some simple steps that can be taken to provide immediate relief.

Many people these days are taking videos that are recorded onto digital media with a video camera or a mobile phone, but many of these end up left practically unwatched, sitting on a hard disk somewhere. Linux offers some great tools to convert these into a DVD with a suitable menu, and this month’s issue shows how it’s done.

Elsewhere, a thorough examination of the privacy tools available on Ubuntu (and elsewhere) is presented. From securing your files with TrueCrypt to using PGP for secure email, it’s vital to prevent your data falling into the hands of malicious hackers. Additionally, Google Chrome’s privacy mode, ‘Incognito’, is demonstrated.

RUN LINUX! is written by experts from Siniatech and is published by Agora Business Publications. To find out more, or subscribe, visit the RUN LINUX! website here.

Website for local playgroup goes live

In our free time the folks here at Siniatech have given the website for the Dacre Banks Pre-School Playgroup a bit of an overhaul. It’s nothing too fancy, just a WordPress-based site that will allow the staff and committee members of the playgroup to keep the parents of their children up-to-date with their latest news.

You can view the site in its entirety at dacrebanksplaygroup.org. This work was completed pro bono as the playgroup is a charity.


enquiries@siniatech.com Company number: 7618179