Wednesday, February 25, 2015

Parsing boring old CSV files with Java

It sometimes seems as if Java (JEE) skipped specifying a robust mechanism for handling CSV files. For instance, there is nothing like a schema description (XSD or DTD), as exists for XML files. Perhaps it was just too trivial, but I've found that despite the existence of more "robust" formats like XML and JSON, a lot of CSV files are still being used, which probably has much to do with the ubiquity of Excel. Whatever the reason, it would be nice to have a schema description for a CSV file, which could be used as a sort of interface description to bridge the gap between developers and business analysts.

I discovered a project from Digital Preservation, which has created a very nice looking schema language for the CSV format; unfortunately, however, it was not well suited for Java. The library is written in Scala and comes with a Java bridge, but at the time of writing this blog, the bridge didn't allow one to get much information about the exact errors and their positions because the result of the parsing was returned as a long, loosely formatted string.

There are other libraries, like Jackson CSV and Super CSV from sourceforge.net, which seem less sophisticated or ambitious than Digital Preservation's; however, I wasn't impressed by them because the problem of bridging the gap between technical and business people doesn't seem to have been addressed. Building up the CSV structure is largely done in Java code at run time and is almost unintelligible to a non-Java person.

So, for my last task, which involved parsing 3 types of CSV files, I resorted to using the Apache Commons CSV parser and Java annotations. What I did was to create a Pojo with all the columns contained in the CSV. Then, I invented some simple annotations with which I decorated the Pojo and which the business analyst could read and understand without much difficulty.

Below is a small example of one of these Pojos. You might laugh that I passed around a Java source code file as a CSV file description, but it worked well and people didn't seem to have problems finding what was relevant for them and understanding it. At first, it might look unintelligible; however, with a little patience one can read through it.

Each Java class defines a CSV line, here one with 4 columns, though in reality there were up to 396 columns. The line's column order is described by the annotation @CsvColumnOrder (with Java reflection it's not possible to determine the order in which fields are declared; hence the extra annotation). The expected type of each column is described within the class definition as a class attribute with further annotations like @CsvChecked, @CsvIsinCode, etc.


DbtPojo.java 
@CsvColumnOrder({
    "IsActive",
    "ExternalCode_ISIN",
    "AmountIssued",
    "PaymentDate"
})
public class DbtPojo {

    @CsvChecked(regex="^[YN]{1}", required=true)
    public String IsActive;

    @CsvIsinCode
    @CsvUnique
    public String ExternalCode_ISIN;

    @CsvNumber(decimal = true)
    public String AmountIssued;

    @CsvDate
    public String PaymentDate;
}



Using reflection, it is possible during parsing to get information about the Pojo's relevant fields by looking for the presence of annotations. For instance, if the second column of a line has been returned by the Apache parser, then from the CsvColumnOrder annotation one knows which field of the Pojo needs to be checked, namely "ExternalCode_ISIN". With a reference to the Java Field, one can check for the presence of a certain annotation by calling getAnnotation() with that annotation's class. If the annotation is present, here CsvIsinCode, one can react appropriately for an ISIN code.
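The lookup described above can be sketched roughly as follows. The miniature annotations and Pojo here only mirror the real ones in shape, and the validation logic is illustrative rather than the actual implementation (which is in the attached zip):

```java
import java.lang.annotation.*;
import java.lang.reflect.Field;

// Miniature versions of the annotations described above (illustrative only).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface CsvColumnOrder { String[] value(); }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface CsvChecked { String regex(); boolean required() default false; }

// A cut-down Pojo with the same structure as DbtPojo above.
@CsvColumnOrder({"IsActive", "ExternalCode_ISIN"})
class MiniPojo {
    @CsvChecked(regex = "^[YN]$", required = true)
    public String IsActive;
    public String ExternalCode_ISIN;
}

public class CsvMetaCheck {

    // Validate one raw CSV value purely from the Pojo's metadata;
    // the Pojo itself is never instantiated.
    static boolean validate(Class<?> pojo, int columnIndex, String rawValue) {
        try {
            String[] order = pojo.getAnnotation(CsvColumnOrder.class).value();
            Field field = pojo.getField(order[columnIndex]);
            CsvChecked checked = field.getAnnotation(CsvChecked.class);
            if (checked == null) {
                return true; // no constraint on this column
            }
            if (rawValue == null || rawValue.isEmpty()) {
                return !checked.required();
            }
            return rawValue.matches(checked.regex());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validate(MiniPojo.class, 0, "Y")); // true
        System.out.println(validate(MiniPojo.class, 0, "X")); // false
    }
}
```

The parser hands back a column index; CsvColumnOrder maps that index to a field name, and the field's annotations decide which check to run.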

I've included the code as a zip for those who are interested in more detail.

The reader should not be misled into thinking that the CSV values from the files are being parsed into instances of the Pojo described above. In fact, the classes are never even instantiated. They are only used for their annotations and field definitions; that is, for metadata purposes.


Tuesday, November 18, 2014

Authentication using simple Active Directory bind with Spring

Recently I wrote a web application which had to be made secure. I wanted to allow the users to use their normal Active Directory passwords and not have to create one especially for the application. The application was also meant to be available only to a selected group of users and not to everyone with a valid AD account.

Rather than trying to get an Active Directory "application" account, which would have been nice as it would have allowed me to search for roles but which would have taken forever to get because of bureaucratic reasons, I decided to use a simple bind to the AD with the entered password and username to authenticate the users.

The problem was that just binding would have allowed everyone with a valid AD account to access the web application, which is what I didn't want. To limit the users, I needed a different way to verify whether or not they were authorized to use the application, and this I did by storing their roles in the application's database.

To test whether or not this approach would work and to determine the connection details, I used Apache Directory Studio as shown below.





To add this approach into the web application, I used Spring Security, which I setup using the name space configuration (XML with specific Spring tags).

    <s:authentication-manager>
        <s:authentication-provider ref="ldapAuthenticationProvider"/>
    </s:authentication-manager>

    <bean id="ldapAuthenticationProvider" class="org.springframework.security.ldap.authentication.LdapAuthenticationProvider">
        <constructor-arg>
            <bean class="eu.ecb.csdb.gateway.security.UsernameBindAuthenticator">
                <constructor-arg ref="contextSource" />
                <!-- just keep the base class happy -->
                <property name="userDnPatterns">
                    <list>
                        <value>sAMAccountName={0},OU=Standard User,OU=Users and Groups</value>
                    </list>
                </property>
            </bean>
        </constructor-arg>
        <constructor-arg>
            <bean class="eu.ecb.csdb.gateway.security.DatabaseAuthoritiesPopulator"/>
        </constructor-arg>
    </bean>


First, I configured my own authentication provider (UsernameBindAuthenticator), which binds to AD using the entered username and password, as shown here:


public class UsernameBindAuthenticator extends AbstractLdapAuthenticator {
/*...*/

    public DirContextOperations authenticate(Authentication authentication) {

        DirContextAdapter result = null;

        String username = authentication.getName();
        String password = (String) authentication.getCredentials();

        String domainUsername = domain + "\\" + username;

        DirContext ctx = null;
        try {
            ctx = getContextSource().getContext(domainUsername, password);

            // Check for password policy control
            PasswordPolicyControl ppolicy = PasswordPolicyControlExtractor.extractControl(ctx);

            // bind was successful if we got here.
            result = new DirContextAdapter();
        } finally {
            LdapUtils.closeContext(ctx);
        }

        return result;
    }
/*...*/
}


Then, I added my own LdapAuthoritiesPopulator (DatabaseAuthoritiesPopulator), which queries the database to see if the user has any roles.

public class DatabaseAuthoritiesPopulator implements LdapAuthoritiesPopulator {

/*...*/

    public Collection<? extends GrantedAuthority> getGrantedAuthorities(DirContextOperations userData, String username) {

        logger.debug(String.format("Getting roles for %s", username));

        List<GrantedAuthority> result = new ArrayList<GrantedAuthority>();

        try {
            // a parameterized query avoids SQL injection via the username
            final String roles = this.jdbcTemplate.queryForObject(
                "select roles from USERS where USERNAME = ?", String.class, username);

            if (roles != null) {
                // roles are stored as a semicolon-separated list, e.g. "ADMIN;USER"
                for (String strRole : roles.split(";")) {
                    result.add(new CsdbAuthority(strRole));
                }
            }
        } catch (Exception ex) {
            logger.debug(ex);
        }

        logger.debug(String.format("%s roles found for %s", result.size(), username));

        return result;
    }

/*...*/

}


The CsdbAuthority is nothing but an implementation of the GrantedAuthority interface.

So if the bind was successful, then I knew that the user was a legitimate AD user and then I looked in the application's database to see if the user was authorized to use the application.

At this point, you might be asking yourself: wouldn't it have been easier just to use the FilterBasedLdapUserSearch and DefaultLdapAuthoritiesPopulator classes from Spring? First of all, no, because I didn't have any roles in LDAP to search for. As mentioned, I wanted a quick solution and not one where I would need to do months of paperwork to get the roles added to AD; this all took place within a very bureaucratic government agency. Secondly, no, because I found the implementation of FilterBasedLdapUserSearch confusing, quirky and poorly documented, and I figured that rather than messing around for hours trying to configure the class, it would be easier just to write my own.


Friday, November 7, 2014

TortoiseCVS - ksh: cvs: not found

While trying to check out a project from CVS using the ssh protocol with TortoiseCVS, I kept getting the following error message: "ksh: cvs: not found".

Setting the environment variables CVS_RSH and CVS_SERVER didn't help. Neither did making PATH changes to my profile on the server side. Then I found the TortoiseCVS settings dialog by right clicking in Windows Explorer as shown below:


Then I found the proper setting, shown below in red,


and changed it; the correct value in my case was "/usr/local/bin/cvs".

This fixed my problem and I could finally check out my project from CVS.



Sunday, October 12, 2014

My first day with joomla!

Recently, I was asked if I could help someone with joomla. I didn't have any experience with it, so I figured the best thing to do was to install it locally on my Windows laptop, which I did using the following instructions from joomla.

XAMPP (in my case xampp-win32-1.8.3-5-VC11-installer.exe) is a complete, free, easy-to-install Apache distribution, which contains MySQL, PHP and Perl. Before installing joomla, it was necessary for me to install XAMPP.

Once XAMPP was installed, the control panel could be started from the Start Menu and the individual services needed for joomla started by pushing the "Action" Start buttons; shown as Stop buttons below.


Now that the necessary software had been installed and was running, it was time to install the joomla web application, which could be downloaded as a zip from here: http://www.joomla.org/download.html

Next, the downloaded zip file was unpacked into the XAMPP installation folder under the htdocs directory. Because the version of joomla was 3.3.4, I decided to install it into a folder called joomla334 under htdocs.



Now, using a browser, the joomla installation was continued by calling: http://localhost/joomla334. This started a wizard, which took me through the installation and ended up as follows:


If you haven't already done so before starting the installation with the browser as described above, it will be necessary to set up a database. This can be done by using the phpMyAdmin web console, which should have already been started using the control panel.

http://localhost/phpmyadmin/


(Or just go to the control panel and press the "Admin" button for the MySQL installation.)

Over on the left, in the screen shot above, where it says "Neu", you can initiate the database creation. The name of the database, here "stebdb", and the user "stebdba" and its password will be needed for the joomla installation.

It was also necessary to make sure that the joomla application user, "stebdba", had the permissions to log into the database from a given host. In my case, the host name defaulted to "%", which is a wildcard matching any host, but which produced problems. Only after I replaced the "%" with "localhost" was I able to configure the database properly with the joomla installation wizard.

At the end of the installation, I was asked to delete the "installation" folder in the %XAMPP_HOME%/htdocs/joomla334 directory. Only after doing so could I continue to either the joomla administrator application using:

http://localhost/joomla334/administrator/index.php



or to the application's front end:

http://localhost/joomla334/



To make backups of an existing joomla installation, consider Akeeba Backup.






Friday, October 10, 2014

Simple pgp encryption with gpg

Simple command line encryption and decryption with gpg

The secret and public key rings are in the working directory.

Encrypt

>gpg.exe -e -r mb-pets@bce.int --keyring pubring.gpg encryptme.txt
Use this key anyway? yes

//output called encryptme.txt.gpg

Including the secret keyring prevents the question about trusting the public key.

>gpg.exe -e -r mb-pets@bce.int --homedir . --keyring .\pubring.gpg --secret-keyring secring.gpg encryptme.txt

//output called encryptme.txt.gpg

Output the encrypted file as base64 text.

gpg.exe -e -a -r mb-pets@bce.int --homedir . --keyring pubring.gpg  --secret-keyring secring.gpg encryptme.txt

//output called encryptme.txt.asc

//With verbose
>gpg.exe -e -a -r mb-step@ecb.int --homedir . --verbose --keyring pubring.gpg  --secret-keyring secring.gpg encryptme.txt
gpg: using secondary key 57D35DF1 instead of primary key 23E858FE
gpg: This key belongs to us
gpg: reading from `.\encryptme.txt'
gpg: writing to `.\encryptme.txt.asc'
gpg: ELG-E/AES256 encrypted for: "57D35DF1 Statistics STEP Transfer (STEP Mail 1) <mb-step@ecb.int>"


It's also possible to base64 encode using openssl, I think.
/usr/sfw/bin/openssl enc -base64 -in signme.txt.gpg  -out signme.txt.b64

An attempt to base64 with PowerShell looked like this; however, beware because this wasn't tested properly.

[System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes(".\signme.txt.gpg")) | Set-Content ".\signme.txt.asc"

To decode it again:
[System.IO.File]::WriteAllBytes(".\out.gpg", [System.Convert]::FromBase64String((Get-Content ".\signme.txt.asc" -Raw)))
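For comparison, the same base64 round trip can also be sketched in Java with java.util.Base64; this is illustrative only, and the sample bytes below are made up stand-ins for the contents of a .gpg file:

```java
import java.util.Arrays;
import java.util.Base64;

public class B64RoundTrip {

    // Encode raw bytes as MIME-style base64 text (76-character lines,
    // similar in shape to an ASCII armor body).
    static String encode(byte[] raw) {
        return Base64.getMimeEncoder().encodeToString(raw);
    }

    // Decode the text back to the original bytes.
    static byte[] decode(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0x85, 0x01, 0x0c, 0x03}; // made-up sample bytes
        System.out.println(Arrays.equals(sample, decode(encode(sample)))); // true
    }
}
```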

Decrypt

>gpg.exe --homedir . --decrypt --secret-keyring .\secring.gpg --keyring .\pubring.gpg .\secret.txt.gpg
You need a passphrase to unlock the secret key for **********











Thursday, September 18, 2014

Spring externalized values not resolved in servlet context

If externalized values (@Value("${value.from.properties.file}")) exist in a Spring servlet context, then the context will need its own PropertySourcesPlaceholderConfigurer. It is not enough to have one defined in the parent context.

Values externalized with the @Value annotation will not be substituted. Instead, one will end up with the contents of the annotation's value attribute as the value of the String field you are attempting to externalize. In the example below, that would be: ${html.report}

/------------------------------ Spring managed bean ---------------------------------/
@Controller
@RequestMapping("/downloadreport")
public class ReportDownloader {

 @Value("${html.report}") 
 private String reportLocation;

}
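To make the failure mode concrete, here is a rough, stand-alone sketch of what a placeholder configurer does with a ${...} expression; the property name and value are made up, and real Spring resolution is considerably more involved:

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderDemo {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Resolve a ${...} expression against a property source. If no value is
    // found (i.e. no configurer in this context), the raw expression remains,
    // which is exactly the symptom described above.
    static String resolve(String expression, Properties props) {
        Matcher m = PLACEHOLDER.matcher(expression);
        if (m.matches()) {
            String value = props.getProperty(m.group(1));
            return value != null ? value : expression;
        }
        return expression;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("html.report", "/var/reports/report.html"); // hypothetical value
        System.out.println(resolve("${html.report}", props));
        System.out.println(resolve("${html.report}", new Properties())); // unresolved
    }
}
```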

Below are the relevant web.xml entries.

/-------------------------------------   web.xml   ------------------------------------/
<!-- application context -->

    <context-param>
        <param-name>contextConfigLocation</param-name>
        <param-value>
            /WEB-INF/applicationContext.xml             
        </param-value>
    </context-param>

    <listener>
        <listener-class>
            org.springframework.web.context.ContextLoaderListener
        </listener-class>
    </listener>

<!-- servlet context created with java config -->

  <servlet>
      <servlet-name>service</servlet-name>
      <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
      <load-on-startup>1</load-on-startup>
      <init-param>
       <param-name>contextClass</param-name>
       <param-value>
        org.springframework.web.context.support.AnnotationConfigWebApplicationContext
       </param-value>
      </init-param>
      <init-param>

       <param-name>contextConfigLocation</param-name>
       <param-value>
        eu.app.spring.mvc.config.WebMvcConfig
       </param-value>

      </init-param>

  </servlet>


Here is the corresponding java config code.

/-------------------------------------   java config file   ------------------------------------/
@Configuration
@EnableWebMvc
@ComponentScan("eu.app.spring.mvc.handler")
@PropertySource("classpath:web/euapp.properties")
public class WebMvcConfig {

 @Bean
 // static, so the bean is registered before the configuration class itself is processed
 public static PropertySourcesPlaceholderConfigurer placeHolderConfigurer() {
  return new PropertySourcesPlaceholderConfigurer();
 }

  

 @Bean
 public InternalResourceViewResolver getInternalResourceViewResolver() {

  InternalResourceViewResolver resolver = new InternalResourceViewResolver();
  resolver.setPrefix("/jsp/");
  resolver.setSuffix(".jsp");
  return resolver;   

 }

}



Monday, July 21, 2014

Creating Java libraries from DLLs for Concept2 ergometer

I have a Concept2 rower with a PM4, which has an SDK that can be downloaded from the Concept2 website. The SDK provides some DLLs, which can be used to access the rower's PM4 computer. This blog describes how I generated some Java libraries using gluegen and Microsoft Visual Studio 2010 to access the DLLs from Java.

Because the SDK provides the C header files for its DLLs, I was able to give these as input to the gluegen program and automatically generate the Java and JNI code necessary to call the Concept2 libraries. Below is an excerpt from the generated JNI code for the tkcmdsetCSAFE_command function.


/*   Java->C glue code:
 *   Java package: org.wmmnpr.c2.csafe.PM3CsafeCP
 *    Java method: short tkcmdsetCSAFE_command(short unit_address, short cmd_data_size, java.nio.LongBuffer cmd_data, java.nio.ShortBuffer rsp_data_size, java.nio.LongBuffer rsp_data)
 *     C function: ERRCODE_T tkcmdsetCSAFE_command(UINT16_T unit_address, UINT16_T cmd_data_size, UINT32_T *  cmd_data, UINT16_T *  rsp_data_size, UINT32_T *  rsp_data);
 */
JNIEXPORT jshort JNICALL 

Java_org_wmmnpr_c2_csafe_PM3CsafeCP_tkcmdsetCSAFE_1command1__SSLjava_lang_Object_2IZLjava_lang_Object_2IZLjava_lang_Object_2IZ(JNIEnv *env, jclass _unused, jshort unit_address, jshort cmd_data_size, jobject cmd_data, jint cmd_data_byte_offset, jboolean cmd_data_is_nio, jobject rsp_data_size, jint rsp_data_size_byte_offset, jboolean rsp_data_size_is_nio, jobject rsp_data, jint rsp_data_byte_offset, jboolean rsp_data_is_nio) {
  UINT32_T * _cmd_data_ptr = NULL;
  UINT16_T * _rsp_data_size_ptr = NULL;
  UINT32_T * _rsp_data_ptr = NULL;

  ERRCODE_T _res;

  if ( NULL != cmd_data ) {
    _cmd_data_ptr = (UINT32_T *) ( JNI_TRUE == cmd_data_is_nio ?  (*env)->GetDirectBufferAddress(env, cmd_data) :  (*env)->GetPrimitiveArrayCritical(env, cmd_data, NULL) );  }
  if ( NULL != rsp_data_size ) {
    _rsp_data_size_ptr = (UINT16_T *) ( JNI_TRUE == rsp_data_size_is_nio ?  (*env)->GetDirectBufferAddress(env, rsp_data_size) :  (*env)->GetPrimitiveArrayCritical(env, rsp_data_size, NULL) );  }
  if ( NULL != rsp_data ) {
    _rsp_data_ptr = (UINT32_T *) ( JNI_TRUE == rsp_data_is_nio ?  (*env)->GetDirectBufferAddress(env, rsp_data) :  (*env)->GetPrimitiveArrayCritical(env, rsp_data, NULL) );  }
  _res = tkcmdsetCSAFE_command((UINT16_T) unit_address, (UINT16_T) cmd_data_size, (UINT32_T *) (((char *) _cmd_data_ptr) + cmd_data_byte_offset), (UINT16_T *) (((char *) _rsp_data_size_ptr) + rsp_data_size_byte_offset), (UINT32_T *) (((char *) _rsp_data_ptr) + rsp_data_byte_offset));

  if ( JNI_FALSE == cmd_data_is_nio && NULL != cmd_data ) {
    (*env)->ReleasePrimitiveArrayCritical(env, cmd_data, _cmd_data_ptr, 0);  }
  if ( JNI_FALSE == rsp_data_size_is_nio && NULL != rsp_data_size ) {
    (*env)->ReleasePrimitiveArrayCritical(env, rsp_data_size, _rsp_data_size_ptr, 0);  }
  if ( JNI_FALSE == rsp_data_is_nio && NULL != rsp_data ) {
    (*env)->ReleasePrimitiveArrayCritical(env, rsp_data, _rsp_data_ptr, 0);  }
  return _res;

}

The first difficulty was that the DLLs, even though they were written in C, had been compiled as C++ code. This meant that when the JNI generated code was compiled as C files, it could not be linked against the C++ static libraries (PM3CsafeCP.lib, PM3DDICP.lib and PM3USBCP.lib); the external references, namely the functions I wanted to call in the DLLs, were mangled according to C++ rules rather than the C ones which my C object files expected. The link error was as follows:

error LNK2019: unresolved external symbol "__imp__tkcmdsetCSAFE_async_command" ...

The only option was to compile the gluegen generated JNI C files as C++ files; however, that wouldn't work because the "__cplusplus" macros in the jni.h header file led to compile errors; namely:

error C2819: type 'JNIEnv_' does not have an overloaded member 'operator ->'

I fixed this by commenting out the "__cplusplus" macros and their corresponding code in a local copy of the jni.h file so that the struct JNIEnv_ definition would stay the same and not change for C++ compilation units. The changes allowed me to compile my C code with the C++ compiler without any errors and thereafter link against the PM4 C++ DLLs, which were really C ones.

The next error occurred at runtime. Loading the DLLs made from the gluegen code with "System.loadLibrary" was no problem given that the DLLs created with the code from gluegen and the ones from the SDK (PM3CsafeCP.dll, PM3DDICP.dll and PM3USBCP.dll) were located in the JVM's search path. My solution was to set the VM argument "java.library.path" to the proper location of the DLLs, which I had placed in one directory.

The first indication that something was wrong with my DLLs occurred when trying to access one of the exported functions; in which case, the following error occurred:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.wmmnpr.c2.ddi.PM3DDICP.tkcmdsetDDI_init()S
at org.wmmnpr.c2.ddi.PM3DDICP.tkcmdsetDDI_init(Native Method)

Again, a problem with the name mangling. Because I had compiled my generated gluegen C source files as C++ source files into DLLs, the Java runtime could not find them under the signatures generated by gluegen. The solution was to make sure that the exported functions in the C modules were mangled using C mangling rules. This I did by enclosing my source code in

extern "C" {
/* code from gluegen */
}

Enclosing the header file declarations would have been better but gluegen didn't produce any header files; only source ones.


With those three errors solved, I was able to write a Java program to access my PM4.