
2021-04-08

Learning Rust. Part 12. Split strings to chars or words, getting a char/word, searching text. 16 examples.



Contents:
1) Something on the difference between str and String
2) Splitting a line into chars; print!("{}", chr);
3) Getting a certain character
4) Splitting a line into words; split_whitespace()
5) Splitting on the n:th occurrence of x
6) Splitting on a character of your choice
7) Check if palindrome
8) Linear search for text or a piece of text
9) Concatenate strings
10) Check if n:th char from console input is equal to X
11) Reading a line from keyboard and convert to number
12) Read one (1) character from the keyboard
13) Processing string by string from a text
14) Testing a string to check it's a number
15) Comparing usize to & str for equality
16) Converting number to a string


Something on the difference between str and String

A String has three components:
a pointer to the text, which is stored on the heap
a capacity => how much memory is allocated for it (method "capacity")
a length => the length of the text in bytes (method "len")

Usage:
fn main() {
    let s = "Monday Tuesday Wednesday Thursday".to_string();
    println!("{}", s.capacity());
    println!("{}", s.len());
}
cargo run
33
33

A str (normally seen as the reference &str) has two components:
a pointer to the text
a length => the length of the text in bytes (method "len")

Usage:
fn main() {
    let s = "Monday Tuesday Wednesday Thursday";
    println!("{}", s.capacity());
    println!("{}", s.len());
}
cargo run
error[E0599]: no method named `capacity` found for reference `&str` in the current scope
--> src/main.rs:3:18
 |
3 | println!("{}", s.capacity());
 |                  ^^^^^^^^ method not found in `&str`

Referencing a String will "coerce" it into a &str wherever a &str is expected. So if both sides are Strings, referring to the right hand side string as &String (e.g. "&my_string" or "&whatever_string" etc.) means it will be used as a &str.
See the examples under the "Concatenate strings" section, below.
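
A small sketch of the coercion at work. The function name shout is made up for this example; the point is only that a parameter of type &str accepts both a &str and a &String.

```rust
// Takes a string slice; works for both &str and &String arguments.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let owned: String = "abc".to_string();
    let literal: &str = "def";

    println!("{}", shout(&owned));  // &String is coerced to &str
    println!("{}", shout(literal)); // already a &str
}
```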

Note: println! and format! take the same kind of format string, but format! returns the formatted text as a String instead of writing it to the console.

Splitting a line into chars; print!("{}", chr);
fn main() {
    let s = "Very long ...  ....string";
    let str2char_in_vec : Vec<char> = s.to_string().chars().collect();
    for chr in str2char_in_vec {
        print!("{}", chr);
    }
}

cargo run:
Very long ...  ....string


A side note: Vec and String both have the methods "capacity" and "len".
Both can be created with preallocated memory: let s = String::with_capacity(n); and let v = Vec::with_capacity(n);
If you're concerned with speed, preallocate and don't add more elements than the existing capacity. Then they will not have to be re-allocated in memory (which takes time).
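
To see the preallocation in action, here is a minimal sketch. The function name fill_preallocated and the size 16 are chosen only for this example. It reports the capacity before and after pushing text that fits in the reserved buffer.

```rust
// Builds a String in a buffer reserved up front and reports
// (capacity before pushing, capacity after pushing, length).
fn fill_preallocated() -> (usize, usize, usize) {
    let mut s = String::with_capacity(16); // room for at least 16 bytes
    let before = s.capacity();
    s.push_str("Monday");   // 6 bytes
    s.push_str(" Tuesday"); // 8 more bytes, 14 in total, still fits
    (before, s.capacity(), s.len())
}

fn main() {
    let (before, after, len) = fill_preallocated();
    println!("capacity before: {}, after: {}, len: {}", before, after, len);
}
```

Since the text fits, the capacity is the same before and after, which means no re-allocation took place.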

Splitting a line into chars; println!("{}", chr); 
fn main() {
    let s = "Very long ... string";
    let str2char_in_vec : Vec<char> = s.to_string().chars().collect();
    for chr in str2char_in_vec {
        println!("{}", chr)
    }
}
cargo run:
V
e
r

.
.
.

r
i
n
g

Getting a certain character:
fn main() {
    let s = "Very long ... string";
    let str2char_in_vec : Vec<char> = s.to_string().chars().collect();

    let ix = 2;
    if  ix < str2char_in_vec.len(){
        println!("{}", str2char_in_vec[ix]);
    }else{
        println!("{}", "Index past end of vector!");
    }

}
cargo run:
r    
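
If you only need one character, collecting into a vector can be skipped. This is a sketch with a made-up helper name, char_at, built on chars().nth(), which returns an Option instead of panicking on a bad index.

```rust
// Returns the character at position `ix` (counting from 0),
// or None if `ix` is past the end of the string.
fn char_at(s: &str, ix: usize) -> Option<char> {
    s.chars().nth(ix)
}

fn main() {
    let s = "Very long ... string";
    match char_at(s, 2) {
        Some(c) => println!("{}", c),
        None => println!("Index past end of string!"),
    }
}
```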

Splitting a line into words; split_whitespace()
fn main() {
    let l = "Very long ... string";
    let str2words_in_vec : Vec<&str> = l.split_whitespace().collect();
    for s in str2words_in_vec {
        print!("{} - ", s);
    }
}
cargo run:
Very - long - ... - string - 
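
Getting a certain word can be done the same way as getting a certain character. A minimal sketch, where word_at is a name invented for this example:

```rust
// Returns word number `ix` (counting from 0), or None if there are fewer words.
fn word_at(line: &str, ix: usize) -> Option<&str> {
    line.split_whitespace().nth(ix)
}

fn main() {
    let l = "Very long ... string";
    println!("{:?}", word_at(l, 1));
    println!("{:?}", word_at(l, 4));
}
```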

Splitting a line on the n:th occurrence of x
fn main() {
    let (a,b) = split_once("hello:world:earth");    // the line that will be split into two parts
    println!("{} {}", a, b)
}
fn split_once (in_string: &str) -> (&str, &str) {
    let mut splitter = in_string.splitn(2, ':');
    //    println!("{:?}",splitter);                 // out of my curiosity...
    let first = splitter.next().unwrap();
    let second = splitter.next().unwrap();
    //    let third = splitter.next().unwrap();      // testing splitting in three, see expl. below
    (first,second)
}
Just because I was curious about what the contents of "splitter" is, I added a println! This is it:
SplitN (SplitNInternal {
        iter: SplitInternal {
            start: 0, end: 17, matcher:
            CharSearcher {
                haystack: "hello:world:earth", finger: 0, finger_back: 17, needle: ':', utf8_size: 1, utf8_encoded: [58, 0, 0, 0]
            }, allow_trailing_empty: true, finished: false
        }, count: 3
})
cargo run
hello world:earth
Changing in_string.splitn(2, ':') to in_string.splitn(3, ':'), adding a "let third = splitter.next().unwrap();" and changing "(first,second)" to "(first,third)", I get:
cargo run
hello earth
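
Rust 1.52 and later also has built-in str::split_once and str::rsplit_once methods, which return an Option instead of needing unwrap(). The sketch below wraps them in two helpers whose names (first_and_rest, rest_and_last) are made up for this example.

```rust
// Splits at the first ':'. Returns None if there is no ':' at all.
fn first_and_rest(line: &str) -> Option<(&str, &str)> {
    line.split_once(':')
}

// Splits at the last ':'. Returns None if there is no ':' at all.
fn rest_and_last(line: &str) -> Option<(&str, &str)> {
    line.rsplit_once(':')
}

fn main() {
    println!("{:?}", first_and_rest("hello:world:earth"));
    println!("{:?}", rest_and_last("hello:world:earth"));
    println!("{:?}", first_and_rest("no colon here"));
}
```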

Splitting on a character of your choice
There are lots of examples of how to split strings on the page
https://doc.rust-lang.org/std/string/struct.String.html
The splitting methods belong to str. A String is normally dereferenced to a str automatically when you call them, but you can also create the str explicitly before splitting. This is easy: refer to "your_string" as "&your_string" and put it in the place of the text in the examples. Like this:
fn main() {
    let s = "hello, this isx a testxx split".to_string();  // s is now a String. Let's say you got this from somewhere
                                                            // and now you want to split it.
    // First you convert it to a slice. Referring to a String will create a &str in the process.
    let s_slice: &str = &*s;    // *s dereferences the String to a str, & borrows all of it (same as s.as_str())
    // The slice is now of the type &str and you can use it in the split statement:
    let v: Vec<&str> = s_slice.split('x').collect();
    println!("{:?}", v);
}
cargo run
["hello, this is", " a test", "", " split"]
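
The empty item in the output comes from the two x:es next to each other. If you don't want empty items, they can be filtered out. A sketch, with split_non_empty as a made-up helper name:

```rust
// Splits `s` on `sep` and drops the empty pieces that appear
// when two separators follow each other.
fn split_non_empty(s: &str, sep: char) -> Vec<&str> {
    s.split(sep).filter(|part| !part.is_empty()).collect()
}

fn main() {
    let s = "hello, this isx a testxx split".to_string();
    // split can be called on the String directly
    let all: Vec<&str> = s.split('x').collect();
    println!("{:?}", all);
    println!("{:?}", split_non_empty(&s, 'x'));
}
```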



Check if palindrome
pub fn check_palindrome(input: &str) -> bool {
    if input.len() == 0 {
        return true;
    }
    let mut last = input.len() - 1;
    let mut first = 0;

    let my_vec = input.as_bytes().to_owned();

    while first < last {
        if my_vec[first] != my_vec[last] {
            return false;
        }

        first +=1;
        last -=1;
    }
    return true;
}
From: https://www.linuxjournal.com/content/text-processing-rust
Note: the function compares bytes, so it is only reliable for ASCII text.
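
For text with non-ASCII characters, a variant that compares chars instead of bytes may be safer. This is my own sketch (the name is_palindrome_chars is made up), not part of the linked article.

```rust
// Compares the characters front-to-back with the characters back-to-front.
// An empty string counts as a palindrome, like in the byte version.
fn is_palindrome_chars(input: &str) -> bool {
    let chars: Vec<char> = input.chars().collect();
    chars.iter().eq(chars.iter().rev())
}

fn main() {
    for word in ["racecar", "rust", "åbå"] {
        println!("{}: {}", word, is_palindrome_chars(word));
    }
}
```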


Linear search for text or a piece of text
pub fn linear_search<T: PartialEq>(item: &T, arr: &[T]) -> i32 {
    let mut idx_pos = -1; // -1 indicates not found

    for (idx, data) in arr.iter().enumerate() {
        if item == data {
            idx_pos = idx as i32;
            return idx_pos;
        }
    }
    idx_pos
}
fn main() {
    let index = linear_search(&"Rust", &vec!["Python", "Php", "Java", "C", "C++", "Rust"]);
    println!("Position: {}", index);

    let index = linear_search(&25, &vec![25, 62, 29, 43, 77]);
    println!("Position: {}", index);

    let index = linear_search(&855, &vec![25, 62, 29, 43, 77]);
    println!("Position: {}", index);
}
From: https://www.hackertouch.com/linear-search-in-rust.html
Note: Binary search is an algorithmically efficient way of searching a sorted array. That means that you'd be forced to sort the text before searching. This is not normally what you want since you'd totally lose the context. However, binary search is much faster on large inputs, so if context isn't an issue, read about this method. I won't use it in the near future, so examples won't appear here.
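
For finding a piece of text inside a longer string, the standard library methods str::find and str::contains do the linear search for you. A minimal sketch; find_text is just a wrapper name made up for this example. Note that the position is a byte index, not a character count.

```rust
// Returns the byte index where `needle` starts in `haystack`,
// or None if it does not occur.
fn find_text(haystack: &str, needle: &str) -> Option<usize> {
    haystack.find(needle)
}

fn main() {
    let text = "Monday Tuesday Wednesday Thursday";
    println!("{:?}", find_text(text, "Wednesday"));
    println!("{:?}", find_text(text, "Friday"));
    println!("{}", text.contains("Tuesday"));
}
```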

Concatenate strings
+ is implemented through the Add trait.
This means that it only works if:
    the left hand side is a String
and the right hand side is (or is coercible to) a &str
This works:
fn main() {
    let s1 = "abc".to_string();
    let s2 = s1 + "def";
    println!("{}", s2);
}
cargo run
abcdef

This works:
fn main() {
    let s1 = "abc".to_string();
    let s2 = "def".to_string();
    let s3 = s1 + & s2;
    println!("{}", s3);
}
cargo run
abcdef
This works:

    let mut s = "My string".to_string();
    let prefix: String = "\n/* ".to_string();
    let suffix: String = "\n".to_string();
    s = prefix + &s + &suffix;
    // s will be "\n/* My string\n". The type is "String"
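
Note that + moves the left hand side String, so it can't be used afterwards. If you want to keep both strings, format! or push_str are alternatives. A sketch, where join_two and append_to are names made up for this example:

```rust
// format! only borrows its arguments, so both inputs stay usable.
fn join_two(a: &str, b: &str) -> String {
    format!("{}{}", a, b)
}

// push_str appends in place to an existing String.
fn append_to(target: &mut String, extra: &str) {
    target.push_str(extra);
}

fn main() {
    let s1 = "abc".to_string();
    let s2 = "def".to_string();
    let s3 = join_two(&s1, &s2);
    println!("{} {} {}", s1, s2, s3); // s1 and s2 are still available

    let mut s4 = s1.clone();
    append_to(&mut s4, &s2);
    println!("{}", s4);
}
```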

Check if n:th char from console input is equal to X

        let s = "Hi ho and a bottle of rum".to_string();
        let n1 = 6;
        let ch1 = s.chars().nth(n1).unwrap();                // ch1 will be = 'a' (counting starts at 0)
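
The actual comparison against X can be written without unwrap(), so that a too short input doesn't panic. A minimal sketch; nth_char_is is a made-up helper name:

```rust
// True if character number `n` (counting from 0) exists and equals `x`.
fn nth_char_is(s: &str, n: usize, x: char) -> bool {
    s.chars().nth(n) == Some(x)
}

fn main() {
    let s = "Hi ho and a bottle of rum".to_string();
    println!("{}", nth_char_is(&s, 6, 'a'));
    println!("{}", nth_char_is(&s, 100, 'a')); // index past the end gives false
}
```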

 Reading a line from keyboard and convert to number
    let mut lnum_try = String::new();
    std::io::stdin().read_line(&mut lnum_try).unwrap();
    lnum_try = lnum_try.trim().to_string();          // the line that was read ends with a "\n"
                                                     // Must use trim()  !
    let n = lnum_try.parse::<u32>().unwrap();     // u32 max: 4 294 967 295
                                                  // u64 max: 18 446 744 073 709 551 615
    return n;
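
The unwrap() calls above will panic if the user types something that isn't a number. Here is a sketch of the same thing with the error handled; parse_number is a name made up for this example.

```rust
use std::num::ParseIntError;

// Trims the trailing "\n" (and other whitespace) and tries to parse a u32.
fn parse_number(line: &str) -> Result<u32, ParseIntError> {
    line.trim().parse::<u32>()
}

fn main() {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("could not read from stdin");

    match parse_number(&input) {
        Ok(n) => println!("Got the number {}", n),
        Err(e) => println!("Not a valid u32: {}", e),
    }
}
```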

Reading one (1) character from keyboard
ASCII only, decimal 0 - 127 !

In .../project/<your project>/Cargo.toml :
[package]
name = "<your project>"
version = "0.1.0"
authors = ["root"]
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
termios = "0.3.3"

The last line is the one to add. Hovering with the mouse pointer over the crate name on the page
https://crates.io/search?q=termios you get a symbol. Click on it. Do "Ctrl-v" below [dependencies] and the correct reference will be added to it. Save to disk.
When you later do "cargo run", the crate will be fetched from the repository. Termios is then installed.
Add to your program:
use std::io;
use std::io::{Read, Write};
use termios::{tcsetattr, Termios, ECHO, ICANON, TCSANOW};

// This uses the Termios crate
fn termios() -> [u8; 1] {                        // returns an ARRAY consisting of 1 u8 !
    let stdin = 0;                               // file descriptor 0 = stdin; couldn't get std::os::unix::io::FromRawFd
                                                 // to work on /dev/stdin or /dev/tty [from the author]
    let termios = Termios::from_fd(stdin).unwrap();
    let mut new_termios = termios.clone();       // make a mutable copy of termios
                                                 // that we will modify
    new_termios.c_lflag &= !(ICANON | ECHO);     // turn off echo and canonical mode
    tcsetattr(stdin, TCSANOW, &mut new_termios).unwrap();
    let stdout = io::stdout();
    let mut reader = io::stdin();
    let mut buffer = [0; 1];                     // read exactly one byte [into this ARRAY!]
    stdout.lock().flush().unwrap();
    reader.read_exact(&mut buffer).unwrap();
    let c = buffer;

    tcsetattr(stdin, TCSANOW, &termios).unwrap(); // reset the stdin to
                                                  // original termios data
    return c;
}
---
To use "c" I do (inside a function that returns a String):
        let mut s = String::new();
        let raw = termios();

        let mybyte = std::str::from_utf8(&raw).unwrap(); // panics if the byte is not ASCII (above 127)
        s.push_str(mybyte);                              // push it onto a String, at the end
        let n1 = s.len()-1;
        let ch1 = s.chars().nth(n1).unwrap(); // finding the last char in the string
                                     // you must also handle the case where there is no char!
        print!("{}",ch1);
        io::stdout().flush().unwrap();

        s                                     // the String is the return value of the function

---

To compare ch1 to check for something:
 if ch1 == 'b' { ... }    // or 'c' or 'd' or whatever

To see the decimal number representation for ch1 do print!("{}", ch1 as u32);

To see the character do print!("{}",ch1);


Processing string by string from a text

Let's say you have a newline delimited text (= normal text) in a String and let's call it 'cont', for 'context'

    let s_slice: &str = &*cont;                            // creating a slice of the whole string
    let v_cont: Vec<&str> = s_slice.split('\n').collect(); // to be able to index, splitting on "\n" and stuffing in a vector
    let lines:usize = v_cont.len()-1;    // number of items minus one. If the text ends with "\n" the last item is empty,
                                         // and "while x < lines" below skips it

then you can do things to the text like this:
(in this case I'm just assembling it to a new text, you may look for characters or what not...)

let mut x:usize = 0;
let mut string_cont1 = String::new();
while x < lines {
    string_cont1.push_str(v_cont[x]);   // doing things
    string_cont1.push_str("\n");        // doing things..
    x += 1;
}
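
The same walk through the text can be done with str::lines(), which takes care of the splitting and of the empty item after the last "\n". A minimal sketch; the function name rebuild is made up for this example.

```rust
// Goes through the text line by line and assembles it again,
// ending every line with "\n".
fn rebuild(cont: &str) -> String {
    let mut out = String::new();
    for line in cont.lines() {
        out.push_str(line); // doing things
        out.push('\n');     // doing things..
    }
    out
}

fn main() {
    let cont = "first line\nsecond line\nthird line\n".to_string();
    print!("{}", rebuild(&cont));
}
```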


Testing a string to check it's a number

In the while x < lines loop, in the example above, you can do this to check if each item (v_cont[x]) from the vector is a number:

while x < lines {
    if v_cont[x].chars().all(char::is_numeric) {
        print!("yes ");
    } else {
        print!("no ");
    }
    x += 1;
}
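
Two things to be aware of with the all(char::is_numeric) test: an empty string gives "yes", and signs like "-" give "no". Below is a sketch of two stricter alternatives. The helper names is_all_digits and is_integer are made up for this example.

```rust
// True only for a non-empty string of ASCII digits 0-9.
fn is_all_digits(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

// True if the string can be parsed as a (possibly negative) whole number.
fn is_integer(s: &str) -> bool {
    s.trim().parse::<i64>().is_ok()
}

fn main() {
    for item in ["123", "", "12a", "-42"] {
        println!("{:?}: digits = {}, integer = {}",
                 item, is_all_digits(item), is_integer(item));
    }
}
```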

Comparing usize to &str for equality

I had a problem comparing line numbers, input from the keyboard, against index numbers from a vector. This is how that was solved. I read in a file as a string and the idea was to withhold the input line numbers from what I wrote back to disk. The hard thing was to compare usize numbers (index for a vector) on the left with &str type numbers (input number from the keyboard) on the right:

    while x < v_cont_lines {                // for each line in the file (x is the index for a vector)

        // here v_inp_num[i] is the content of an item in another vector
        for i in 0..v_inp_num_len {      // the line number is checked against all items of the input vector
            if &*x.to_string() == v_inp_num[i] {            // this worked!!
                found = true;                      // if this line number (x) is a part of the input vector
                                                   // 'found' will be set to true
            }
        }
        if found == false {                        // if false this line should not be withheld, push to result string
            result.push_str(v_cont[x]);
            result.push_str("\n");
        }else{
            found = false;                            // but if 'found' is true, I have to reset 'found' to false again
        }
        x += 1;
    }
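
Another way to do the comparison is to go in the other direction: parse the typed numbers into usize once, and then compare numbers with numbers. This is only a sketch under my own assumptions; remove_lines and its parameters are made-up names, and input items that aren't numbers are simply ignored.

```rust
// Returns `cont` without the lines whose index (counting from 0)
// is listed in `skip`. Items in `skip` that don't parse are ignored.
fn remove_lines(cont: &str, skip: &[&str]) -> String {
    let skip_nums: Vec<usize> = skip
        .iter()
        .filter_map(|s| s.trim().parse::<usize>().ok())
        .collect();

    let mut result = String::new();
    for (x, line) in cont.lines().enumerate() {
        if !skip_nums.contains(&x) {
            result.push_str(line);
            result.push('\n');
        }
    }
    result
}

fn main() {
    let cont = "line zero\nline one\nline two\n";
    print!("{}", remove_lines(cont, &["1"]));
}
```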


Converting number to a string

               let mut k = n * 1000.0;
               k = f32::trunc(k) / 10.0;
               let l: String = k.to_string();            // to_string() does the conversion; a number is not cast to String automatically

to create filenames in a loop using the number (png is a String defined elsewhere, not shown here)

               let name = "fractal".to_string() + &l + &png;
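
The number can also be put straight into the name with format!, without the intermediate String. A sketch, assuming the suffix is ".png" (the original png variable isn't shown, so that is my guess); file_name is a made-up function name.

```rust
// Same arithmetic as above, then the number is formatted into the name.
// The ".png" suffix is an assumption made for this example.
fn file_name(n: f32) -> String {
    let k = f32::trunc(n * 1000.0) / 10.0;
    format!("fractal{}.png", k)
}

fn main() {
    for n in [0.125_f32, 0.5] {
        println!("{}", file_name(n));
    }
}
```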

















