JavaDevNotes.com

Java String Split Tutorial And Examples

This tutorial explains how to use the Java String split method, with examples for quick reference.

Description

The Java String split method breaks a String into pieces around each occurrence of a delimiter. The delimiter is interpreted as a regular expression, and the returned value is an array of String.

Here is the syntax:

public String[] split(String regex)

For example, suppose we have a String that contains animals separated by commas:

String sampleString = "Cat,Dog,Elephant";

When we call the split method, we get back an array that contains each individual animal. Here is the sample code:

public static void main(String[] args) {
   String sampleString = "Cat,Dog,Elephant";
   String[] animals = sampleString.split(",");
   System.out.println("The number of animals is: " + animals.length);
   for (String animal : animals) {
      System.out.println(animal);
   }
}

The returned array will contain 3 elements. These are: "Cat", "Dog" and "Elephant". The output of the code above is this:

The number of animals is: 3
Cat
Dog
Elephant

Limit

We can also specify the maximum number of items in the returned String Array of the split() method. Here is the syntax:

public String[] split(String regex, int limit)

Limit less than number of items

As an example, suppose we have 4 items in a comma-separated String and we pass the value of 2 for the limit parameter:

String[] shapes = "Circle,Square,Rectangle,Hexagon".split(",", 2);

The Java String split method will only return 2 items, even when there are 4 possible elements. Here is the equivalent String array:

String[] shapes = {"Circle", "Square,Rectangle,Hexagon"};

Notice that after the first occurrence of the comma, the rest of the String was left unsplit and became the second element of the resulting String array.

If we pass the value of 3 for the limit parameter:

String[] shapes = "Circle,Square,Rectangle,Hexagon".split(",", 3);

This will be the equivalent resulting value:

String[] shapes = {"Circle", "Square", "Rectangle,Hexagon"};

Notice again the value of the last element: only the first two commas were used as delimiters.
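
A common use of a small limit is splitting a line at the first delimiter only. The following is a minimal sketch, assuming a "key=value" line whose value may itself contain "="; the class name KeyValueSplit and the helper parseEntry are hypothetical names chosen for illustration.

```java
public class KeyValueSplit {

    // Hypothetical helper: splits "key=value" at the first '=' only.
    static String[] parseEntry(String line) {
        // A limit of 2 means the delimiter is applied at most once,
        // so any later '=' stays inside the second element.
        return line.split("=", 2);
    }

    public static void main(String[] args) {
        String[] entry = parseEntry("url=http://example.com/?q=java");
        System.out.println(entry[0]); // url
        System.out.println(entry[1]); // http://example.com/?q=java
    }
}
```

Without the limit, the same line would be split into three pieces and the value would have to be joined back together.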

Limit more than number of items

However, if we pass a limit value that is more than the number of possible elements, every delimiter is used. For a String with no trailing delimiters, the result is the same as not passing a limit parameter at all. For example, this code:

String[] shapes = "Circle,Square,Rectangle,Hexagon".split(",", 10);

Is equivalent to:

String[] shapes = {"Circle", "Square", "Rectangle", "Hexagon"};
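
To compare several limit values side by side, a small sketch like the one below may help. The class name LimitDemo and the helper describe are assumptions made for this example; only String.split and Arrays.toString come from the standard library.

```java
import java.util.Arrays;

public class LimitDemo {

    // Hypothetical helper: formats the result of splitting on "," with a given limit.
    static String describe(String input, int limit) {
        return limit + " -> " + Arrays.toString(input.split(",", limit));
    }

    public static void main(String[] args) {
        String shapes = "Circle,Square,Rectangle,Hexagon";
        for (int limit : new int[] {1, 2, 4, 10}) {
            System.out.println(describe(shapes, limit));
        }
        // 1 -> [Circle,Square,Rectangle,Hexagon]
        // 2 -> [Circle, Square,Rectangle,Hexagon]
        // 4 -> [Circle, Square, Rectangle, Hexagon]
        // 10 -> [Circle, Square, Rectangle, Hexagon]
    }
}
```

A limit of 1 applies the delimiter zero times, so the whole String comes back as a single element.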

Regular expression

The delimiter passed to the split() method is treated as a regular expression. For example, the regular expression that represents one or more numeric digits is this:
String digitsRegex = "\\d+";

Numeric delimiter

Here is example code that splits a String where the delimiter is a series of numeric digits:

String[] items = "abc123def456ghi789".split("\\d+");

This is equivalent to:

String[] items = {"abc", "def", "ghi"};

Three items were returned because the digits in the String were treated as delimiters.

Alphabet delimiter

Here is the regular expression for a sequence of one or more letters of the alphabet:

String lettersRegex = "[a-zA-Z]+";

Here is example code that uses this regular expression:

String[] items = "123abc456def789".split("[a-zA-Z]+");

This is equivalent to:

String[] items = {"123", "456", "789"};

Three items were returned because the letters in the String were treated as delimiters.
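
Because the delimiter is a regular expression, it can also absorb optional whitespace around a separator. The sketch below assumes a comma-separated list with inconsistent spacing; the class name RegexDelimiterDemo and the helper splitList are hypothetical.

```java
import java.util.Arrays;

public class RegexDelimiterDemo {

    // Hypothetical helper: splits on a comma surrounded by optional whitespace.
    static String[] splitList(String csv) {
        // trim() removes whitespace at both ends of the String, which the
        // delimiter pattern itself would not remove.
        return csv.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String[] animals = splitList(" Cat, Dog ,Elephant ");
        System.out.println(Arrays.toString(animals)); // [Cat, Dog, Elephant]
    }
}
```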

Pitfalls

Here are the things that you should watch out for when working with Java String's split method.

Special Characters

Since the parameter for the delimiter is a regular expression, it should be noted that some characters have special meaning. For example, we can't use "." and "|" as is.

  • Using "."
    public static void main(String[] args) {
       String[] items = "A.B.C".split(".");
       System.out.println("Number of items is: " + items.length);
    }
    
    Will output:
    Number of items is: 0
    
  • Using "|"
    public static void main(String[] args) {
       String[] items = "A|B|C".split("|");
       System.out.println("Number of items is: " + items.length);
    }
    
    Will output (on Java 8 and later; Java 7 and earlier print 6 because they include a leading empty String):
    Number of items is: 5
    

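Output for both is unexpected. Our intent is to have 3 items equivalent to "A", "B" and "C". In a regular expression, "." matches any character, so every character is treated as a delimiter and only empty Strings remain, which split() discards. The "|" is the alternation operator with nothing on either side, so it matches the empty String between characters and the input is split into individual characters, delimiters included.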

The proper way is to escape the special characters with "\\".

  • Using "\\."
    String[] items = "A.B.C".split("\\.");
    
    Is equivalent to
    String[] items = {"A", "B", "C"};
    
  • Using "\\|"
    String[] items = "A|B|C".split("\\|");
    
    Is also equivalent to
    String[] items = {"A", "B", "C"};
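
When the delimiter is only known at run time, escaping it by hand is error prone. One alternative, sketched below, is java.util.regex.Pattern.quote, which turns any String into a pattern that matches it literally. The class name LiteralDelimiterDemo and the helper splitLiteral are hypothetical names used for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralDelimiterDemo {

    // Hypothetical helper: treats the delimiter as plain text, not as a regex.
    static String[] splitLiteral(String input, String delimiter) {
        // Pattern.quote wraps the delimiter so that special characters
        // such as "." or "|" lose their regular expression meaning.
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("A.B.C", "."))); // [A, B, C]
        System.out.println(Arrays.toString(splitLiteral("A|B|C", "|"))); // [A, B, C]
        System.out.println(Arrays.toString(splitLiteral("1+2+3", "+"))); // [1, 2, 3]
    }
}
```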
    

Empty Strings

There are cases when the returned String array contains empty Strings.
  • Prefix - when the String starts with the delimiter, the first element is an empty String.
    String[] items = ",A,B,C".split(",");
    
    Is equivalent to
    String[] items = { "", "A", "B", "C" };
    
  • Middle - each additional occurrence of the delimiter in the middle of the String results in a corresponding empty String.
    String[] items = ",A,B,,,C".split(",");
    
    Is equivalent to
    String[] items = { "", "A", "B", "", "", "C" };
    
  • Suffix - all occurrences of the delimiter at the end of the String are ignored.
    String[] items = "A,B,C,,,,,,,,,,".split(",");
    
    Is equivalent to
    String[] items = { "A", "B", "C" };
    

    However, we can force split to keep the empty Strings produced by trailing delimiters by passing a negative limit such as -1:

    String[] items = "A,B,C,,,".split(",", -1);
    
    Is equivalent to
    String[] items = { "A", "B", "C", "", "", "" };
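
The trailing-delimiter rule matters when the number of fields is significant, for example when a row has empty columns at the end. The following is a minimal sketch under that assumption; the class name TrailingEmptyDemo, the helper countFields and the sample row are made up for illustration.

```java
public class TrailingEmptyDemo {

    // Hypothetical helper: counts comma-separated fields, including empty ones at the end.
    static int countFields(String row) {
        // A negative limit keeps trailing empty Strings in the result.
        return row.split(",", -1).length;
    }

    public static void main(String[] args) {
        String row = "1,Alice,,";
        System.out.println(row.split(",").length); // 2, trailing empty fields dropped
        System.out.println(countFields(row));      // 4, trailing empty fields kept
    }
}
```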
    

