Monday, December 2, 2013

Removing invalid characters from XML

As you probably know, XML consists of markup and character data. The characters with special meaning in markup are > (greater than), < (less than), ' (single quote), " (double quote) and & (ampersand). Character data, which appears inside text nodes or attribute values, can be text in any language.
But not every Unicode character is allowed in XML as character data. Two sections of the XML 1.0 specification (Second Edition) cover this:
  1. http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char
  2. http://www.w3.org/TR/2000/REC-xml-20001006#syntax
The following Java code implements these rules: it keeps only the code points permitted by the Char production and drops every other character.

private static String removeInvalidXMLCharacters(String xmlString) {
    StringBuilder out = new StringBuilder();
    int codePoint;
    int i = 0;
    while (i < xmlString.length())
    {
        // This is the unicode code of the character.
        codePoint = xmlString.codePointAt(i);
        if ((codePoint == 0x9) ||
                (codePoint == 0xA) ||
                (codePoint == 0xD) ||
                ((codePoint >= 0x20) && (codePoint <= 0xD7FF)) ||
                ((codePoint >= 0xE000) && (codePoint <= 0xFFFD)) ||
                ((codePoint >= 0x10000) && (codePoint <= 0x10FFFF)))
        {
            out.append(Character.toChars(codePoint));
        }
        i += Character.charCount(codePoint);
    }
    return out.toString();
}
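
The same filter can also be expressed as a single regular expression. The sketch below is an alternative to the loop above, not a replacement taken from the spec; the class and method names are made up for illustration, and it assumes Java 7 or later for the \x{...} escape.

```java
import java.util.regex.Pattern;

class XmlCharFilterSketch {
    // Negated character class listing every code point allowed by the
    // XML 1.0 Char production; anything outside it is matched and removed.
    private static final Pattern INVALID_XML_CHARS = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    static String stripInvalid(String text) {
        return INVALID_XML_CHARS.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // 0x1 is a control character and 0xFFFE is a non-character; both are illegal in XML.
        String dirty = "ok" + (char) 0x1 + "text" + (char) 0xFFFE;
        System.out.println(stripInvalid(dirty)); // prints "oktext"
    }
}
```

Compiling the pattern once in a static field avoids recompiling it on every call, which matters if the filter runs over many strings.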

Tuesday, November 26, 2013

Get day difference between two dates using Java

return (int)( (date2.getTime() - date.getTime()) / (1000 * 60 * 60 * 24));
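
The one-liner assumes two java.util.Date variables named date and date2. A self-contained sketch makes that explicit; the class and method names are illustrative. Like the one-liner, it counts whole 24-hour periods, so two dates that span a daylight-saving change can come out one day short.

```java
import java.util.Date;
import java.util.concurrent.TimeUnit;

class DayDifferenceSketch {
    // Number of whole 24-hour periods between the two instants (truncated toward zero).
    static int daysBetween(Date from, Date to) {
        long diffMillis = to.getTime() - from.getTime();
        return (int) TimeUnit.MILLISECONDS.toDays(diffMillis);
    }

    public static void main(String[] args) {
        Date start = new Date(0L);
        Date end = new Date(TimeUnit.DAYS.toMillis(3));
        System.out.println(daysBetween(start, end)); // prints 3
    }
}
```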

First day of next month in Java


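import java.util.Calendar;
import java.util.Date;

public static Date getNextMonth() {
    Calendar c = Calendar.getInstance();
    c.setTime(new Date());
    // Move to the next month, then pin the day to the 1st.
    c.add(Calendar.MONTH, 1);
    c.set(Calendar.DATE, 1);

    // Reset the time to midnight. HOUR_OF_DAY is the 24-hour field;
    // Calendar.HOUR is the 12-hour field and would leave AM/PM untouched.
    c.set(Calendar.HOUR_OF_DAY, 0);
    c.set(Calendar.MINUTE, 0);
    c.set(Calendar.SECOND, 0);
    c.set(Calendar.MILLISECOND, 0);
    return c.getTime();
}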

Date nextMonth = getNextMonth();
System.out.println(nextMonth);

Output when run during November 2013:

Sun Dec 01 00:00:00 ALMT 2013
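
On Java 8 or later, the java.time API can do the same calculation without resetting fields by hand. This is an alternative sketch rather than the Calendar approach above; the class and method names are illustrative, and it returns a LocalDate instead of a java.util.Date.

```java
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

class NextMonthSketch {
    // Returns the first day of the month following the given date.
    static LocalDate firstDayOfNextMonth(LocalDate today) {
        return today.with(TemporalAdjusters.firstDayOfNextMonth());
    }

    public static void main(String[] args) {
        System.out.println(firstDayOfNextMonth(LocalDate.of(2013, 11, 26))); // prints 2013-12-01
    }
}
```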

Saturday, November 23, 2013

Java: generate a string from a regular expression

It is easy to check a string against a regular expression, but Java has no built-in method for generating a string from one. I rarely need this, but when I did I found nothing ready-made, so I wrote the following program: it generates a string from a regular expression and then checks the result against that same expression. It is sufficient for simple expressions, which covers my needs; it does not parse complex ones.
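
The core idea, producing random characters that satisfy one character class and repeating them a random number of times within the quantifier's bounds, can be sketched in isolation. This is a simplified illustration under its own assumptions (printable ASCII candidates only, hypothetical class and method names); it is not the full program from this post.

```java
import java.util.Random;
import java.util.regex.Pattern;

class CharClassSketch {
    private static final Random RANDOM = new Random();

    // Builds a random string of min..max characters, each matching charClass.
    static String randomMatch(String charClass, int min, int max) {
        Pattern pattern = Pattern.compile(charClass);
        // Collect the printable ASCII characters the class accepts
        // (an assumption made to keep the sketch small).
        StringBuilder candidates = new StringBuilder();
        for (char c = 0x20; c < 0x7F; c++) {
            if (pattern.matcher(String.valueOf(c)).matches()) {
                candidates.append(c);
            }
        }
        if (candidates.length() == 0) {
            throw new IllegalArgumentException("No printable ASCII match for " + charClass);
        }
        // nextInt's bound is exclusive, hence the + 1 to include max.
        int count = min + RANDOM.nextInt(max - min + 1);
        StringBuilder out = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            out.append(candidates.charAt(RANDOM.nextInt(candidates.length())));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String local = randomMatch("[a-z]", 1, 1) + randomMatch("[a-z0-9]", 8, 10);
        System.out.println(local + " matches: " + local.matches("[a-z][a-z0-9]{8,10}"));
    }
}
```

Enumerating the accepted characters up front guarantees that every generated character matches, at the cost of supporting only one character class per call.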

Output example using random regex generator


Regex: [a-z][a-z0-9]{8,10}@[a-z0-9]{5,9}.com
Matches: [a-z][a-z0-9]{8,10}@[a-z0-9]{5,9}.com, String: bak65qvkpp@3qosue.com, Length: 21



Regex: [\w][\D]{0,10}[\d\W]{1,10}[abc.][abc09]{1,10}
Matches: [\w][\D]*[\d\W]+[abc.][abc09]+, String: aLchJL,e’‘.c9bca, Length: 16



Regex: [A-Z]{4}-#%[\d]{3}
Matches: [A-Z]{4}-#%[\d]{3}, String: BAKZ-#%114, Length: 10



Regex: www.[a-z][a-z0-9-_]{5,15}[a-z].com
Matches: www.[a-z][a-z0-9-_]{5,15}[a-z].com, String: www.jj829x10cq9ngilk.com, Length: 24



Regex: 01[7856][0-9]{2}-[0-9]{6}
Matches: 01[7856][0-9]{2}-[0-9]{6}, String: 01700-736202, Length: 12

StringRegex.java (main class)


import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 *
 * @author Pritom K Mondal
 * @version 1.0
 */
public class StringRegex {
    public static void main(String[] args) {
        String regex = "[a-z][a-z0-9]{8,10}@[a-z0-9]{5,9}.com";
        StringRegex.stringFromRegex(regex);
        regex = "[\\w][\\D]*[\\d\\W]+[abc.][abc09]+";
        StringRegex.stringFromRegex(regex);
        regex = "[A-Z]{4}-#%[\\d]{3}";
        StringRegex.stringFromRegex(regex);
        regex = "www.[a-z][a-z0-9-_]{5,15}[a-z].com";
        StringRegex.stringFromRegex(regex);
        regex = "01[7856][0-9]{2}-[0-9]{6}";
        StringRegex.stringFromRegex(regex);
    }
    
    public static void stringFromRegex(String regex) {         
        String regexConverted = cvtLineTerminators(regex);
        System.out.println("Regex: " + regexConverted);
        String fullString = "";
        fullString = parse(regexConverted, "");
        if(fullString.matches(regex)) {
            System.out.println("Matches: " + regex + ", String: " + fullString + ", Length: " + fullString.length());
        } else {
            System.err.println("Matches Failed: " + regex + ", String: " + fullString + ", :Length: " + fullString.length());
        }
        System.out.println("\n\n");
    }
    
    private static String parse(String regex, String fullString) {
        Random random = new Random();
        if(regex.trim().length() > 0) {
            Boolean allow = false, processed = false;
            if(regex.startsWith("[") 
                    && (regex.substring(0, regex.indexOf("]") + 1).matches("(.*)[a-z]\\-[a-z](.*)") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("(.*)[A-Z]\\-[A-Z](.*)") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("(.*)d(.*)") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("(.*)D(.*)") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("(.*)w(.*)") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("(.*)W(.*)") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("\\[([a-z]*)([0-9]*).\\]") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("\\[[A-Z].\\]") 
                    || regex.substring(0, regex.indexOf("]") + 1).matches("(.*)[0-9]\\-[0-9](.*)"))) {
                allow = true;
            }
            if(allow) {
                int start = regex.indexOf("[");
                int end = regex.indexOf("]", start);
                String part = regex.substring(start, end + 1);
                regex = regex.substring(end + 1);
                int pos1 = 1, pos2 = 1;
                if(regex.startsWith("{")) {
                    start = 0;
                    String pos = "";
                    end = regex.indexOf("}", start);
                    pos = regex.substring(start + 1, end);
                    //part = part + regex.substring(start, end + 1);
                    regex = regex.substring(end + 1);
                    
                    if(pos.trim().length() > 0) {
                        if(pos.contains(",")) {
                            String[] poss = pos.split(",");
                            pos1 = Integer.parseInt(poss[0]);
                            pos2 = Integer.parseInt(poss[1]);
                        } else {
                            pos1 = Integer.parseInt(pos);
                            pos2 = Integer.parseInt(pos);
                        }
                    }
                }
                //System.out.println("Pos: " + pos1 + " to " + pos2);
                
                StringBuilder sb = new StringBuilder();                
                String str = "";
                String printInfo = "Processing-1: " + part;
                int tried = 0;
                while(sb.toString().length() == 0 && tried <= 5) {
                    tried++;
                    str = RandomString.nextString(2000);
                    try {
                        Matcher m = Pattern.compile(part).matcher(str);
                        while (m.find()) {
                            sb.append(m.group(0).toString());
                        }
                    } catch (Exception ex) {
                        System.out.println("Exception: " + ex.getMessage());
                    }
                }
                if(sb.toString().length() > 0) {
                    processed = true;
                    str = sb.toString();
                    while(true) {
                        if(str.length() < pos2) {
                            str = str.concat(str);
                        } else {
                            break;
                        }
                    }
                    printInfo += " = (" + pos1 + ", " + pos2 + "): ";
                    String cutString = str.substring(0, pos1);
                    if(pos2 - pos1 > 0) {
                        // nextInt's bound is exclusive, so add 1 to make the maximum length reachable.
                        pos2 = random.nextInt(pos2 - pos1 + 1);
                        cutString += str.substring(pos1, pos2 + pos1);
                    }
                    printInfo += cutString;
                    //System.out.println(printInfo);
                    fullString += cutString;
                } else {
                    regex = part + regex;
                }
            } 
            if(regex.startsWith("[") && !processed) {
                processed = true;
                int start = regex.indexOf("[");
                int end = regex.indexOf("]", start);
                String part = regex.substring(start + 1, end);
                regex = regex.substring(end + 1);
                String printInfo = "Processing-2: " + part;
                int pos1 = 1, pos2 = 1;
                if(regex.startsWith("{")) {
                    start = 0;
                    String pos = "";
                    end = regex.indexOf("}", start);
                    pos = regex.substring(start + 1, end);
                    regex = regex.substring(end + 1);
                    
                    if(pos.trim().length() > 0) {
                        if(pos.contains(",")) {
                            String[] poss = pos.split(",");
                            pos1 = Integer.parseInt(poss[0]);
                            pos2 = Integer.parseInt(poss[1]);
                        } else {
                            pos1 = Integer.parseInt(pos);
                            pos2 = Integer.parseInt(pos);
                        }
                    }
                }
                String pushString = "";
                for(int i = 0; i < pos1; i++) {
                    pushString += part;
                }
                if(pos2 - pos1 > 0) {
                    // nextInt's bound is exclusive, so add 1 to make the maximum count reachable.
                    pos2 = random.nextInt(pos2 - pos1 + 1);
                    for(int i = 0; i < pos2; i++) {
                        pushString += part;
                    }
                }
                printInfo += "{" + pos1 + "," + pos2 + "} = " + pushString;
                //System.out.println(printInfo);
                fullString += pushString;
            } 
            if(!processed) {
                //System.out.println("Handled: " + regex);
                regex = "[" + regex;
                int fa = regex.indexOf("]"), f2 = regex.indexOf("{"), f3 = regex.indexOf("[", 1);
                if(fa > f3) fa = f3;
                if(fa > f2) fa = f2;
                if(fa < 0) {
                    if(f2 < f3 && f2 >= 0) {
                        fa = f2;
                    } else if(f3 > f2 && f3 >= 0) {
                        fa = f3;
                    } else {
                        fa = regex.length();
                    }
                }
                //regex = regex.substring(0, fa) + "]" + regex.substring(fa);
                regex = regex.substring(0, 2) + "]" + regex.substring(2);
                //System.out.println("Handled: " + regex);
            }
            fullString = parse(regex, fullString);
        }
        return fullString;
    }
    
    private static String cvtLineTerminators (String s) {
        s = s.replaceAll("\\*", "{0,10}");
        s = s.replaceAll("\\+", "{1,10}");
        
        StringBuffer sb = new StringBuffer ();
        int oldindex = 0, newindex;
        while ((newindex = s.indexOf ("\\n", oldindex)) != -1) {
            sb.append (s.substring (oldindex, newindex));
            oldindex = newindex + 2;
            sb.append ('\n');
        }
        sb.append (s.substring (oldindex));

        s = sb.toString ();
        sb = new StringBuffer ();
        oldindex = 0;
        while ((newindex = s.indexOf ("\\r", oldindex)) != -1) {
            sb.append (s.substring (oldindex, newindex));
            oldindex = newindex + 2;
            sb.append ('\r');
        }
        sb.append (s.substring (oldindex));

        s = sb.toString ();
        sb = new StringBuffer ();
        oldindex = 0;
        while ((newindex = s.indexOf ("\\s", oldindex)) != -1) {
            sb.append (s.substring (oldindex, newindex));
            oldindex = newindex + 2;
            sb.append (" ");
        }
        sb.append (s.substring (oldindex));
        
        return sb.toString();
   }
}

RandomString.java


import java.util.Random;

/**
 *
 * @author Pritom K Mondal
 */
public class RandomString {
    private static final char[] symbols = new char[10 + 26 + 26 + 3 + 8];
    private static final char[] chars = new char[52];
    private static final char[] numbers = new char[10];
    
    static {
        for (int idx = 0; idx < 10; ++idx) {
            numbers[idx] = (char) ('0' + idx);
        }
        for (int idx = 10; idx < 36; ++idx) {
            chars[idx - 10] = (char) ('a' + idx - 10);
        }
        for (int idx = 36; idx < 62; ++idx) {
            chars[idx - 10] = (char) ('A' + idx - 36);
        }
        /**
         * String
         */
        int total = 0;
        for(int idx = 48; idx <= 57; idx++, total++) {
            symbols[total] = (char) idx;
        }// 10 (10)
        for(int idx = 65; idx <= 90; idx++, total++) {
            symbols[total] = (char) idx;
        } // 26 (36)
        for(int idx = 97; idx <= 122; idx++, total++) {
            symbols[total] = (char) idx;
        } // 26 (62)
        for(int idx = 44; idx <= 46; idx++, total++) {
            symbols[total] = (char) idx;
        } // 3 (65)
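        // 145 and 146 are the Windows-1252 codes for ‘ and ’; cast to a Java char
        // they become the control characters U+0091 and U+0092.
        int[] copyFrom = { 64, 95, 44, 46, 45, 145, 146, 35 };
                         //@   _   ,   .   -   ‘     ’    #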
        for(int idx = 0; idx < copyFrom.length; idx++, total++) {
            symbols[total] = (char) copyFrom[idx];
        } // 8 (73)
        //System.out.println(total);
    }
    
    private static Random random = new Random();
    private static char[] buf;
    
    public static String fullString() {
        return new String(symbols);
    }
    
    public static String nextString(int length) {
        if (length < 1) {
            throw new IllegalArgumentException("length < 1: " + length);
        }
        buf = new char[length];
        //System.out.println(symbols);
        for (int idx = 0; idx < buf.length; ++idx) {
            buf[idx] = symbols[random.nextInt(symbols.length)];
        }
        return new String(buf);
    }
    
    public static String nextNumber(int length) {
        if (length < 1) {
            throw new IllegalArgumentException("length < 1: " + length);
        }
        buf = new char[length];
        for (int idx = 0; idx < buf.length; ++idx) {
            buf[idx] = numbers[random.nextInt(numbers.length)];
        }
        return new String(buf);
    }
    
    public static String nextCharacter(int length) {
        if (length < 1) {
            throw new IllegalArgumentException("length < 1: " + length);
        }
        buf = new char[length];
        for (int idx = 0; idx < buf.length; ++idx) {
            buf[idx] = chars[random.nextInt(chars.length)];
        }
        return new String(buf);
    }
}

Friday, November 22, 2013

Java: generate a sequence of random characters, numbers or strings

import java.util.Random;

/**
 * Created by Pritom K Mondal.
 */
public class RandomString {
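    public static String formatAsLength(long number, int minLength) {
        // Zero-pad to minLength digits. String.format rejects a zero width, so skip padding below 1.
        return minLength < 1 ? Long.toString(number) : String.format("%0" + minLength + "d", number);
    }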

    private static final char[] symbols = new char[62];
    private static final char[] chars = new char[52];
    private static final char[] numbers = new char[10];

    static {
        initialize();
    }

    public static void initialize() {
        Integer symIdx = 0, numIdx = 0, charIdx = 0;
        for (int idx = 48; idx <= 57; ++idx) {
            symbols[symIdx++] = (char) (idx);
            numbers[numIdx++] = (char) (idx);
        }
        for (int idx = 97; idx < 26 + 97; ++idx) {
            symbols[symIdx++] = (char) (idx);
            chars[charIdx++] = (char) (idx);
        }
        for (int idx = 65; idx < 26 + 65; ++idx) {
            symbols[symIdx++] = (char) (idx);
            chars[charIdx++] = (char) (idx);
        }
    }

    private static Random random = new Random();

    public static String nextString(int length) {
        if (length < 1) {
            throw new IllegalArgumentException("length < 1: " + length);
        }
        if (symbols[0] == (char) 0) {
            initialize();
        }
        StringBuilder returnString = new StringBuilder(length);
        for (int idx = 0; idx < length; ++idx) {
            returnString.append(symbols[random.nextInt(symbols.length)]);
        }
        return returnString.toString();
    }

    public static String nextNumber(int length) {
        if (length < 1) {
            throw new IllegalArgumentException("length < 1: " + length);
        }
        if (symbols[0] == (char) 0) {
            initialize();
        }
        char[] buf = new char[length];
        for (int idx = 0; idx < buf.length; ++idx) {
            buf[idx] = numbers[random.nextInt(numbers.length)];
        }
        return new String(buf);
    }

    public static String nextCharacter(int length) {
        if (length < 1) {
            throw new IllegalArgumentException("length < 1: " + length);
        }
        if (symbols[0] == (char) 0) {
            initialize();
        }
        char[] buf = new char[length];
        for (int idx = 0; idx < buf.length; ++idx) {
            buf[idx] = chars[random.nextInt(chars.length)];
        }
        return new String(buf);
    }
}

Usage:


String mixing = RandomString.nextString(40);
String number = RandomString.nextNumber(40);
String chars = RandomString.nextCharacter(40);
System.out.println("MIXING: " + mixing);
System.out.println("NUMBER: " + number);
System.out.println("CHARS:  " + chars);

Output:


MIXING: ZP8RQCpEF7Kr9veJDlbUWafqNxjqh4xwJr3bvw9J
NUMBER: 5578864807912590082652398553220665093932
CHARS:  nZvejdXyiMEzzbeuJyVTpaGwmGykqNVMQLAckAfP
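
The formatAsLength helper is not exercised in the usage above. Here is a minimal standalone sketch of zero-padding with String.format; the class and method names are illustrative, and the handling of widths below 1 is an assumption about the intended behavior.

```java
class ZeroPadSketch {
    // Left-pads a number with zeros up to minLength digits.
    static String pad(long number, int minLength) {
        if (minLength < 1) {
            // Assumed behavior: no padding requested, return the plain digits.
            return Long.toString(number);
        }
        return String.format("%0" + minLength + "d", number);
    }

    public static void main(String[] args) {
        System.out.println(pad(42L, 6)); // prints 000042
        System.out.println(pad(42L, 0)); // prints 42
    }
}
```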