Regular Expressions and the Quran

Written by: Hanif, on 12.Feb.2010

Regular expressions (abbreviated RegEx) are signs that are combined into an intelligent and easy-to-understand system. We can use this system in OpenQuran to make our work much faster and more exact. If users of OpenQuran are researching the counts and calculations of letters, words and verses and their gematrical values, then it is very important to understand the idea behind RegEx.

RegEx is mostly used in programming languages, databases, search engines, spam filters etc. It is easy to understand, but building good RegEx definitions takes some artistic talent. There are also many sites with more detailed information on how to use this small but powerful language; here I would like to show simply how to use it with the Arabic language, and especially with the Quran.

The theoretical part

The following table lists the most common signs (orange color) on the left and their meaning on the right.

Sign        Meaning

.           a dot stands for any single sign: a letter, a number or a space
*           a star after a sign means this sign may occur many times or not at all
.*          matches any string, including an empty one
[a-z]*      finds a run of small letters, or nothing
(?i)        ignore case: finds small and capital letters alike
th.*s       finds th………….s
^A          the text must begin with A
s$          the text must end with s

[a-zA-Z]    only small or capital letters (inside [ ] a | is an ordinary sign, so [a-z|A-Z] would also find |)
[^a-z]      finds any sign that is none of these letters
[a1]        finds a or 1
[^abcd]     finds any sign that is none of a, b, c or d

s|t         s or t
M(e|i)*r    (e or i, any number of times) finds Mr, Mer, Mir, Meir, Mier and also Miiiiieeeeer
M(e|i)?r    (e or i, at most once) finds only Mr, Mer or Mir
a+          finds at least one a, or many: aaaaaaa…..
a{5}        finds exactly 5 times a: aaaaa
a{3,}       a must occur 3 times or more
a{,3}       a occurs 3 times or fewer (not every engine accepts this form; a{0,3} works everywhere)
a{3,8}      a must occur 3 to 8 times
\w          finds a word char (letter, number or underscore)
\W          finds any sign that is not a word char
\d          finds a number [0-9]
\D          finds any sign that is not a number
\s          finds white space
\S          finds any sign that is not white space
\b          word boundary: \bword\b finds "word" only as a whole word
\B          not a word boundary
\t          tab
|           or
\           escape: takes away the special meaning of the next sign, e.g. \. finds a real dot
[ ]         contains a set of different signs, one of which must match
( )         contains a group of signs (a string)
{x,y}       the sign before it must occur x to y times
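The difference between * and ? after a group, and the role of | inside [ ], are easy to check outside of OpenQuran. The following is only a small sketch using Python's re module (an assumption; OpenQuran's own engine may differ in details), with made-up sample words:

```python
import re

# (e|i)* allows any number of e/i between M and r ...
star = re.compile(r"M(e|i)*r")
# ... while (e|i)? allows at most one e or i.
optional = re.compile(r"M(e|i)?r")

words = ["Mr", "Mer", "Mir", "Meir", "Miiiiieeeeer"]
star_hits = [w for w in words if star.fullmatch(w)]
optional_hits = [w for w in words if optional.fullmatch(w)]

# Inside [ ] the | sign is an ordinary character, not "or".
pipe_in_class = re.fullmatch(r"[a-z|A-Z]", "|") is not None
pipe_without = re.fullmatch(r"[a-zA-Z]", "|") is not None

print(star_hits)      # all five words
print(optional_hits)  # only Mr, Mer, Mir
print(pipe_in_class, pipe_without)
```

Since the Quran text contains no | sign, definitions written as [x|y] give the same results there as [xy].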

 

The practical part (Using RegEx in OpenQuran)

To use RegEx in OpenQuran we have to tick the checkbox "Regular expressions" and choose the table (translation) in which we want to search; here we will choose the table quran, which is installed automatically with OQ.

Important:

  • The orange colored signs are to be typed into the search box.
  • The gray colored sentences are the results of our search.
  • If the calculation takes too much time, you can stop it at any time by pressing the ESC button.
  • We can also use the analyze function (right click on the search box) instead of the search button. It is much faster and shows detailed information about the calculation.
  • If you see a small star on the right side of a number, as in 304*, that means the number is divisible by 19.
  • Notice that the most confusing part of our work is combining Arabic chars with Latin signs. That is why we are forced to switch the writing direction to "RightToLeft" (RTL) in some examples.
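The star marker from the list above is simple to reproduce when checking results by hand. A minimal sketch in Python; the function name mark_19 is made up for this illustration and is not part of OpenQuran:

```python
def mark_19(n: int) -> str:
    """Return n as text, with a trailing * when n is divisible by 19."""
    return f"{n}*" if n % 19 == 0 else str(n)


print(mark_19(304))   # 304 = 19 * 16, so it gets a star
print(mark_19(6234))  # not divisible by 19, no star
```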

Counts of chars:


\w

This counts every letter found in the whole Quran. (Run Analyze via right click instead of a normal search, because OQ would otherwise color every found sign red.)
// 326030 Entries (328158 with Basmalah/s* ), 6234 Verses in 114* Suras
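To see what \w and \b do with Arabic text, the same definitions can be tried on a single verse. This is only a sketch in Python, assuming the text is stored without diacritics as in the examples of this article; the Basmalah serves as the sample:

```python
import re

# Sample text: the Basmalah, written without diacritics (assumption).
basmalah = "بسم الله الرحمن الرحيم"

letters = re.findall(r"\w", basmalah)                 # every letter, spaces skipped
words = re.findall(r"\b\w+\b", basmalah)              # every whole word
six_letter_words = re.findall(r"\b\w{6}\b", basmalah)  # words of exactly 6 letters

print(len(letters))        # 19
print(len(words))          # 4
print(six_letter_words)
```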


^\w{7}\b

View verses which begin with a 7-letter word, like ( واقيموا الصلوه ) in verse 2:43
// 304* Entries, 304* Verses in 77 Suras


\b\w{6}\b

View any word which consists of 6 letters
// 11163 Entries (11387 with Basmalah/s ), 4669 Verses in 112 Suras


^[\w| ]{19}ا

Search for the letter Alif after counting 19 letters or spaces from the beginning of the verse
// 710 Entries, 710 Verses in 95* Suras


^(\w[ ]*){114}ا

Search for the letter Alif after counting 114 letters from the beginning of the verse (without counting spaces)
// 57* Entries, 57* Verses in 23 Suras
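The difference between counting letters or spaces and counting letters only is easier to see with a small number. A sketch in Python, using 3 instead of 19 or 114 and the Basmalah as an assumed sample verse:

```python
import re

verse = "بسم الله الرحمن الرحيم"

# [\w ]{3} consumes exactly 3 characters of any kind (letters or spaces),
# so the 4th character (a space) would have to be Alif: no match.
chars_then_alif = re.search(r"^[\w ]{3}ا", verse)

# (\w[ ]*){3} consumes exactly 3 letters plus any spaces following them,
# so the Alif of the second word is reached.
letters_then_alif = re.search(r"^(\w[ ]*){3}ا", verse)

print(chars_then_alif)
print(letters_then_alif.group())
```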


^الرحمن

find verses which begin with الرحمن
3 Entries, 3 Verses in 3 Suras. Count Sura Numbers: 76*


الرحيم$

find verses which end with الرحيم
33 Entries, 33 Verses in 19* Suras, Count Sura Numbers: 551*, Letters count of all found verses: 1444*


ل{2}

find all double لل
3253 Entries (3365 with Basmalah/s ), 2147* Verses in 96 Suras


[ب|س|م|ا|ل|ه|ر|ح|ن|ي]

find the letter B or S or M etc (letters of Basmalah)
221654* Entries (19*19*614) (223782* with Basmalah/s* ), 6234 Verses in 114* Suras, Count Sura Numbers: 6555*


[^ب|ج|د|و|ز|ف|ش|ت|ث|خ|ذ|ض|ظ|غ]

find any letter except B or G or D etc. (the non-initial letters)
315155 Entries (317507 with Basmalah/s ), 6234 Verses in 114* Suras


When searching for some unique words like Allh = الله it is important to know the different forms of the word. Sometimes it is written LLH = لله or FALLH = فالله or BALLH = بالله or WLLH = ولله , but DLLH = ضلله must not be found. The best definition we found to get all these words is as follows:

( |ا|و|ف|^)لله( |$)

2698* Entries (2810 with Basmalah/s ), 1820 Verses in 85 Suras
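The definition above can be tested against the word forms just listed. A minimal sketch in Python, checking each form as a separate string (an assumption made only for this illustration):

```python
import re

allah_forms = re.compile(r"( |ا|و|ف|^)لله( |$)")

# The forms named in the text; the last one must NOT be found.
samples = ["الله", "لله", "فالله", "بالله", "ولله", "ضلله"]
found = [w for w in samples if allah_forms.search(w)]

print(found)
```

The last form is rejected because the sign before لله is neither a space, Alif, Waw or Fa nor the beginning of the text.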


\bالقرءان\b

find the word ALQRAAN exactly the way it is written
43 Entries, 43 Verses in 28 Suras, Count Sura Numbers: 893*


\bكتب|كتاب\b

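find the word KTB or KTAB. Note that | binds weaker than \b, so this definition finds words which begin with KTB or end with KTAB; to find only the two exact words, the alternatives have to be put in a group ( ) between the two \b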
102 Entries, 95* Verses in 45 Suras
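Because | has the lowest priority in a definition, it makes a difference whether the two alternatives are grouped. A sketch in Python with illustrative sample words (chosen for this example, not taken from the results above):

```python
import re

ungrouped = re.compile(r"\bكتب|كتاب\b")      # reads as: \bكتب  OR  كتاب\b
grouped = re.compile(r"\b(?:كتب|كتاب)\b")    # whole word كتب or whole word كتاب

longer_word = "كتبنا"     # merely starts with كتب
with_article = "الكتاب"   # merely ends with كتاب

print(bool(ungrouped.search(longer_word)), bool(grouped.search(longer_word)))
print(bool(ungrouped.search(with_article)), bool(grouped.search(with_article)))
print(bool(grouped.search("كتاب")))
```

Which of the two readings is wanted depends on the research question; the grouped form is the stricter one.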


(\b(ال|و|ف|ء|ب)*ا[ء]*له(ا|ك|كم|نا|ه|ين|ت|تهم|تي)*\b)|( |ا|و|ف|^)لله( |$)|اللهم

This is a difficult example showing the root ALH اله in all the different shapes used in the Quran
2849 Entries, 1878 Verses in 86 Suras


[^ع]ابدا

Here we want the word ABDA ابدا but without the letter ع before it
28 Entries, 28 Verses in 15 Suras


ن[ ]*و[ ]*ن

This finds the letters ن و ن inside one word, as in يمنون , or spread over 2 words beside each other, as in بالسنين ونقص
247* Entries, 235 Verses in 65 Suras
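The optional spaces [ ]* are what allow the three letters to cross a word border. A short sketch in Python with the two examples from the text:

```python
import re

nun_waw_nun = re.compile(r"ن[ ]*و[ ]*ن")

inside_word = nun_waw_nun.search("يمنون")
across_words = nun_waw_nun.search("بالسنين ونقص")

print(inside_word.group())   # the three letters without a space
print(across_words.group())  # the same letters with a space in between
```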


To look for every verse which begins with one specified letter and ends with another specified letter, we do as follows.
We search for verses that begin with Ba ب and end with Mim م , like the Basmalah (first verse of surah 1):

^ب.+م$

OpenQuran finds 3 such verses, which together consist of 114 (19*6) letters.

If we also search for verses that begin with Saad ص and end with Nun ن (like the last verse of surah 1), we again find 3 verses, which have a gematrical value of 9728 (19*512):

^ص.+ن$

We can also combine both methods in one definition. For example, we search for verses which begin with the letter Ba ب or Saad ص and end with Mim م or Nun ن :

^[ب|ص].+[م|ن]$

In this way we find 46 verses which have a gematrical value of 130473 (19*6867)
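The begin/end definitions work verse by verse, so ^ and $ stand for the first and last sign of each verse. A sketch in Python; the two sample verses are written without diacritics and their spelling is an assumption:

```python
import re

# Illustrative sample verses (assumed spelling, no diacritics).
verses = [
    "بسم الله الرحمن الرحيم",
    "الحمد لله رب العلمين",
]

ba_to_mim = re.compile(r"^ب.+م$")
either = re.compile(r"^[بص].+[من]$")

hits = [v for v in verses if ba_to_mim.search(v)]
hits_either = [v for v in verses if either.search(v)]

# Letter count of all found verses, as OpenQuran reports it.
letter_total = sum(len(re.findall(r"\w", v)) for v in hits)

print(len(hits), len(hits_either), letter_total)
```

The second sample begins with Alif, so neither definition accepts it.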


To look for all words which begin with a specified letter (e.g. the letter س ), we use:

\bس+

Sometimes letters have 2 or more different shapes, like Alif and Hamza. The following example will find all words beginning with Alif or Hamza:

\b[ا|ء]+


At the beginning of the verse there must be (a word with 3 letters), then (a space), then (ALLH), the same as in بسم الله

^\w{3} الله
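Both ideas, words beginning with a given letter and a fixed structure at the beginning of a verse, can be tried on one verse. A sketch in Python with the Basmalah as an assumed sample; \w* is added here so that the whole word is returned and not only its first letter:

```python
import re

verse = "بسم الله الرحمن الرحيم"

# \bا finds the Alif at the beginning of a word; \w* extends to the word's end.
alif_words = re.findall(r"\bا\w*", verse)

# A 3-letter word, a space, then ALLH at the beginning of the verse.
starts_like_basmalah = re.search(r"^\w{3} الله", verse)

print(alif_words)
print(starts_like_basmalah.group())
```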


Additional search functions

We have also added some important functions to the search in OpenQuran, such as:

  1. count(k=2);  // find verses which contain exactly 2 letters k
  2. count(k<3); // find verses which contain 2 letters k or fewer
  3. count(k>1); // find verses which contain 2 letters k or more
  4. cutspace(يس); // remove the spaces from the verses before searching for يس
  5. gv(l=10); // find letters which have a gematrical value of 10
  6. gv(w=19); // find words which have a gematrical value of 19
  7. gv(v=228); // find verses which have a gematrical value of 228
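How a gematrical value is built up can be sketched in a few lines of Python. The letter table below is the common Abjad numbering; this is an assumption, since the article does not list the table OpenQuran uses (for example for Hamza), and the function name gv is borrowed only for illustration:

```python
# Common Abjad letter values (assumed; OpenQuran's own table may differ).
ABJAD = {
    "ا": 1, "ب": 2, "ج": 3, "د": 4, "ه": 5, "و": 6, "ز": 7, "ح": 8, "ط": 9,
    "ي": 10, "ك": 20, "ل": 30, "م": 40, "ن": 50, "س": 60, "ع": 70, "ف": 80,
    "ص": 90, "ق": 100, "ر": 200, "ش": 300, "ت": 400, "ث": 500, "خ": 600,
    "ذ": 700, "ض": 800, "ظ": 900, "غ": 1000,
}


def gv(text: str) -> int:
    """Sum the letter values of a word or verse; unknown signs count as 0."""
    return sum(ABJAD.get(ch, 0) for ch in text)


word_value = gv("الله")
verse_value = gv("بسم الله الرحمن الرحيم")

print(word_value)   # 1 + 30 + 30 + 5 = 66
print(verse_value)  # 102 + 66 + 329 + 289 = 786
```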

To use one of these functions we do as follows:

In the search tab we right-click to open the menu and choose "Search Unique" (or press Ctrl+U).

A small window appears where we choose the function we need from the combo box and define our search letter/word/phrase.

In the count function we also define the number of occurrences we are looking for.
Hint: Within the count(); functions we can also use RegEx!
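The idea behind count(); with a RegEx inside can be sketched as a filter over verses. This is not OpenQuran's implementation; the name count_filter, its parameters and the sample verses (assumed spelling, no diacritics) are made up for the illustration:

```python
import operator
import re

COMPARE = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def count_filter(verses, pattern, op, n):
    """Keep the verses in which `pattern` occurs =, < or > n times."""
    compare = COMPARE[op]
    return [v for v in verses if compare(len(re.findall(pattern, v)), n)]


verses = ["بسم الله الرحمن الرحيم", "الحمد لله رب العلمين"]

exactly_four_lam = count_filter(verses, "ل", "=", 4)    # like count(ل=4)
more_than_four_lam = count_filter(verses, "ل", ">", 4)  # like count(ل>4)
one_double_lam = count_filter(verses, "ل{2}", "=", 1)   # a RegEx as search term

print(len(exactly_four_lam), len(more_than_four_lam), len(one_double_lam))
```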

References to Regular Expressions

Regular Expression Basic Syntax Reference

Regular Expression HOWTO

Regular Expressions Cheat Sheet (V2)

BOOKS TO REGULAR EXPRESSIONS

If you have any questions regarding RegEx in OpenQuran, we are glad to help, but please forgive my bad English.

Salam


6 Comments

rlc
08.Mar.2014
perfect!!! thx a lot


mahdi
15.Oct.2013
Hello
why in sura Al-Bagharah : ابرهم
And in other surahs: ابرهیم
?
thank you


admin: 17.Oct.2013
Hello Mahdi

We already wrote a post in German about this issue here: Abram = Abraham. It explains that the Quran confirms previous scriptures such as the Torah. If you read the story of Abraham in the Torah (1. Mose - Chapter 17), you will understand that Abraham had 2 names, one in the past and one later as a prophet.

Muhamed
20.Jun.2013
Assalam Aleikum wr wb.
Thanks for the great article I almost lost hope in regex'ing arabic characters! I'm currently working on a Quran site. I'm trying to get some arabic characters found by regular expressions (in php) but I'm struggling to get it working. Hopefully you can help me! Please contact me.
Wa salam
Brother Muhamed


admin: 21.Jun.2013
Salam Brother

We are also using RegEx on our Quran site: http://www.alquran.eu
PHP is a bit different. You need to use the preg_ functions with the u modifier for non-Latin languages. See here: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

Tell me what exactly you need.
Salam