- xDebug
- use xdebugtoolkit to convert the XDebug cachegrind output (= "xdebug.log") to a dot graph
http://code.google.com/p/xdebugtoolkit/
# svn co http://xdebugtoolkit.googlecode.com/svn/tags/0.1.3/xdebugtoolkit/ xdebugtoolkit
# ./cg2dot.py cachegrind.out.23328 > cachegrind.out.23328.dot
# cd /usr/ports/graphics/graphviz ; make install ; rehash
# dot -Tpng -otest.png cachegrind.out.23328.dot
Combined into one command:
# ./cg2dot.py cachegrind.out.23328 | dot -Tpng -otest.png
/usr/ports/graphics/py-pydot might also be useful(?)
Windows version of Graphviz is also available at: http://www.graphviz.org/Download..php
XDot: Interactive viewer for Graphviz dot files
http://code.google.com/p/jrfonseca/wiki/XDot
- CacheGrind
- WinCacheGrind
- CacheGrindvisualizer
http://code.google.com/p/cachegrindvisualizer/
- Graphviz - Graph Visualization Software
http://graphviz.org/About.php
- ZGRViewer, a GraphViz/DOT Viewer
http://zvtm.sourceforge.net/zgrviewer.html
Dot Language
http://en.wikipedia.org/wiki/DOT_language
- Scalable Vector Graphics (SVG)
http://en.wikipedia.org/wiki/Scalable_Vector_Graphics
Friday, October 30, 2009
Read UTF-8 Unicode Characters on Console on FreeBSD
language-locale
Chinese locale setup: below are the settings for both UTF-8 and Big5, using tcsh as the example.
After a user logs in, /etc/csh.cshrc and /etc/csh.login are read.
1. vi /etc/csh.login
2. # to type Chinese in remote sessions
   setenv ENABLE_STARTUP_LOCALE zh_TW.Big5
   setenv LC_CTYPE is_IS.ISO8859-1
   setenv LANG zh_TW.Big5
   # to type Chinese on the local console
   # FreeBSD 5.x: is_IS.ISO8859-1
   # FreeBSD 4.x: is_IS.ISO_8859-1
   (what the LC_* categories stand for)
3. In Vim, :h option-list lists all the available option settings.
4. REF: http://www.study-area.org/tips/vim/Vim-9.html
   Chinese locale setup - http://freebsddoc.twbbs.org/zh-tut/setlocale.html
Two sets of setenv lines follow: the first set is for Big5, the second for UTF-8.
#setenv LC_ALL en_US.ISO8859-1
setenv LC_COLLATE zh_TW.Big5
setenv LC_CTYPE zh_TW.Big5
setenv LC_MESSAGES zh_TW.Big5
setenv LC_MONETARY zh_TW.Big5
setenv LC_NUMERIC zh_TW.Big5
setenv LC_TIME en_US.ISO8859-1
setenv LANG zh_TW.Big5
#setenv LC_ALL en_US.UTF-8
setenv LC_COLLATE en_US.UTF-8
setenv LC_CTYPE en_US.UTF-8
setenv LC_MESSAGES en_US.UTF-8
setenv LC_MONETARY en_US.UTF-8
setenv LC_NUMERIC en_US.ISO8859-1
setenv LC_TIME en_US.ISO8859-1
setenv LANG en_US.UTF-8
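After logging in again, a quick way to confirm which values the shell actually picked up (a check I am adding here, not part of the original note) is the standard locale(1) command:
% locale
Any category still reporting "C" or "POSIX" means the corresponding setenv line was not read.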
Reference:
http://sites.google.com/site/iwhiori/FreeBSD/language-locale
http://blog.yzlin.org/2008/05/14/22/
http://www.jeffhung.net/blog/articles/jeffhung/742/
http://blog.yzlin.org/2007/12/12/3/
Only SRC, No DOC - Protocol (只有 SRC 沒有 DOC)
by thinker
Keywords: coding
When working with open-source code, you often get only the source and no design document. If the code is badly written, with arbitrary variable and function names, it gets even worse; understanding such a system is far more than a headache.
To analyze code like this, as I said in 模組的關聯強度 (Module Coupling Strength), you must first understand how the modules are partitioned and how they relate. To understand the partitioning you must first understand the interfaces. An interface consists mainly of data structures and exported functions, so to understand an interface you must first understand what those structures and functions do. Everything therefore starts from understanding the role of each structure.
To understand a structure or function, observing it from the outside matters as much as reading it from the inside. Most people (myself included) dive straight into a function's code, hoping that following the program flow will reveal the function's purpose and the structure's role. That approach often drops you into a trap of hidden, unknown information. The trap comes from the code itself being an incomplete source of information: much implicit information was never fully expressed when the function was written. Once you fall in, no amount of head-scratching gets you out.
Yet if this information were truly absent from the code, the program could not work. In fact the information is merely scattered, spread across the code that uses these functions. I call this part of the information the protocol. The rules for using an interface, including call order, dependencies, and external state, together form the interface's protocol. Users of the interface must follow this protocol for the module to work correctly.
Protocol information is usually not written in the interface's implementation, that is, in the functions themselves. Documents and comments usually forget to record it too; it stays only in the programmer's head. In that case the implementation alone cannot yield this information: it is buried in the usage examples. We must work out when other modules call which function with what data, and from that reconstruct the full protocol.
So besides reading the implementation's flow, hunting for usage examples is an essential step. You can search for them with grep, a simple and handy tool that is available almost everywhere, at least on UNIX-like systems. But grep is slow: it takes a long time to dig the information you want out of a large code base.
I personally recommend GNU GLOBAL, a tag system. GLOBAL builds a database of the code ahead of time; you can look up where a function is defined, and also where a particular function is called, and both complete in an instant. People who trace source code cannot stand waiting. Finding what you need instantly is a necessity; anything else just breeds rage.
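For concreteness, a minimal GNU GLOBAL session looks like this (a sketch of the usual gtags/global workflow; http_req_parse is only an example name):
$ gtags                      # build the tag database at the top of the source tree
$ global -x http_req_parse   # where is this function defined?
$ global -rx http_req_parse  # who calls it? (-r lists references)
The -r query is the one that matters for protocol hunting: it lists every caller, i.e. every usage example, in one shot, where grep -rn http_req_parse would crawl the whole tree.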
Last updated: 2007-06-13 10:55:34 CST
COMMENTS:
on 2007-06-14 01:59:54 CST
av said ..
Just throw it into VC; no need to compile, Visual Assist alone does the job!
on 2007-06-14 10:24:52 CST
Thinker said ..
I'm used to working in a terminal, so an IDE didn't come to mind. For an open-source solution, consider source-navigator: http://sourcenav.sourceforge.net/
Module Coupling Strength (模組的關聯強度)
by thinker
Keywords: coding
Whether you are porting, rewriting, or even carving up a piece of software, the beginning is usually sheer chaos. The most important thing at that point is to understand each module's function as quickly as possible. In reverse engineering, understanding what a module does means untangling the message exchanges between modules and, from them, figuring out each module's role in the system.
Most people do this by tracing call sites through the code, piecing together the interaction patterns between modules step by step and guessing their functions from that. But interactions between modules are usually scattered rather than gathered in one block of code, and calls between different modules interleave rather than run consecutively, which makes tracing even harder. The information gained from tracing is fragmentary; only through repeated guessing, correction, and verification does a module's true shape slowly emerge.
Relationships between modules are tangled, but if you can grasp the main threads and analyze them, the overall shape surfaces. Finding those threads, however, is anything but easy; often you search everywhere and come up empty. Difficult as it is, the main threads do leave traces: the more important a thread, the more often its interactions repeat. On that principle, collecting statistics over the scattered call sites and analyzing them should pick out the important threads.
Module relationships could then be rendered graphically: the stronger the coupling between two modules, the thicker the edge between them and the closer together they sit. We could then see at a glance which inter-module interactions matter and which are peripheral, and systematic analysis with such a tool should make understanding a system far more efficient.
Tools like Doxygen already build these cross-references for us, but they neither aggregate them statistically nor visualize the result. Post-processing doxygen's output with some counting and analysis should yield this information quickly and assemble the tool we need. I don't yet know of an existing tool that does this; if you do, please send me a mail or leave a comment. Failing that, when I have time some day I may stand on other tools' shoulders and build one. A rough sketch of the idea appears below.
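A minimal sketch of that idea (my addition; it assumes doxygen ran with HAVE_DOT = YES and DOT_CLEANUP = NO, so the generated call-graph .dot files survive under the html/ output directory):
$ grep -h ' -> ' html/*.dot | sort | uniq -c | sort -rn | head
Every ' -> ' line in a dot file is one call edge, so the count in front of each edge is a crude measure of coupling strength, and the top of the list marks the main threads discussed above.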
Last updated: 2007-04-24 01:29:44 CST
COMMENTS:
on 2007-04-24 03:36:12 CST
Kuon said ..
Programs like Doxygen that build xrefs count as a kind of source code analysis, but as far as I recall they implement no loop emulation; that is, they cannot express how many times a given implementation calls a function. That probably still has to be filled in by run-time profiling.
on 2007-04-24 16:25:08 CST
Thinker said ..
I personally lean toward static analysis; static information is less complicated. Static information such as the number of links, compared with dynamic call counts, may express inter-module relationships better. After all, from the programmer's point of view, static coupling is the main source of a system's logical complexity.
Comparing Code Readability (程式碼的可讀性比較)
by thinker
Keywords: coding
Below, code readability and the binaries produced are analyzed together. The versions appear in order, each improving readability over the previous one; comparing them gives a clearer picture of what readability really is.
The program below is a parser for an HTTP request line. A request line has the format
GET /path/to/resource HTTP/1.1
The program's goal is to extract the request line's three fields into method, uri, and protocol.
Program 1
The most straightforward, spelled-out approach; probably a stage every programmer passes through!!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *method;
    char *uri;
    char *proto;
} http_req;

int http_req_parse(http_req *req, const char *buf, int sz) {
    int i, prev;

    for(i = 0; i < sz; i++) {
        if(buf[i] == ' ') break;
        if(buf[i] == '\n' || buf[i] == '\r')
            return -1;
    }
    if(i == sz || i == 0) return -1;
    req->method = (char *)malloc(i + 1);
    strncpy(req->method, buf, i);
    req->method[i] = 0;

    prev = ++i;
    for(; i < sz; i++) {
        if(buf[i] == ' ') break;
        if(buf[i] == '\n' || buf[i] == '\r') break;
    }
    if(i == sz || i == prev || buf[i] != ' ') {
        free(req->method);
        return -1;
    }
    req->uri = (char *)malloc(i - prev + 1);
    strncpy(req->uri, buf + prev, i - prev);
    req->uri[i - prev] = 0;

    prev = ++i;
    for(; i < sz; i++) {
        if(buf[i] == ' ') break;
        if(buf[i] == '\n' || buf[i] == '\r') break;
    }
    if(i != sz || i == prev) {
        free(req->method);
        free(req->uri);
        return -1;
    }
    req->proto = (char *)malloc(i - prev + 1);
    strncpy(req->proto, buf + prev, i - prev);
    req->proto[i - prev] = 0;

    return 0;
}

int main(int argc, const char *argv[]) {
    const char *data = "GET /test.html HTTP/1.1";
    http_req req;

    if(http_req_parse(&req, data, strlen(data)) < 0) {
        fprintf(stderr, "error to parse request line!\n");
        return 1;
    }

    printf("request line: %s\n", data);
    printf("method: %s\n", req.method);
    printf("uri: %s\n", req.uri);
    printf("protocol: %s\n", req.proto);

    return 0;
}
Repeated actions
Extract the actions that keep repeating and give them meaningful names. The program not only gets smaller; the meaning carried by the function names strengthens its readability. This shows that turning parts of a program's flow into functions with appropriate names improves readability. strncspn() is really a variant of strcspn() (see the Linux or FreeBSD man page), and strndup() is a variant of strdup(). With these meaningful names the code's flow carries more meaning, offers more clues, and is easier for the brain to decode.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *method;
    char *uri;
    char *proto;
} http_req;

int strncspn(const char *s, int max, const char *charset) {
    int i, j, cs_sz;
    char c;

    cs_sz = strlen(charset);
    for(i = 0; i < max && s[i] != 0; i++) {
        c = s[i];
        for(j = 0; j < cs_sz; j++) {
            if(c == charset[j]) return i;
        }
    }
    return max;
}

char *strndup(const char *s, int max) {
    int sz = strlen(s);
    char *buf;

    if(sz > max) sz = max;
    buf = (char *)malloc(sz + 1);
    memcpy(buf, s, sz);
    buf[sz] = 0;

    return buf;
}

int http_req_parse(http_req *req, const char *buf, int sz) {
    const char *substr, *last, *next;
    int substr_sz;

    last = buf + sz;

    substr_sz = strncspn(buf, sz, " \r\n");
    if(substr_sz == sz || substr_sz == 0 || buf[substr_sz] != ' ')
        return -1;
    req->method = strndup(buf, substr_sz);

    substr = buf + substr_sz + 1;
    substr_sz = strncspn(substr, last - substr, " \r\n");
    next = substr + substr_sz;
    if(substr_sz == 0 || next == last || *next != ' ') {
        free(req->method);
        return -1;
    }
    req->uri = strndup(substr, substr_sz);

    substr = next + 1;
    substr_sz = strncspn(substr, last - substr, " \r\n");
    next = substr + substr_sz;
    if(next != last) {
        free(req->method);
        free(req->uri);
        return -1;
    }
    req->proto = strndup(substr, substr_sz);

    return 0;
}

/* main() is identical to program 1 */
Separating the logic
The programs so far interleave several strands of parsing logic. The program below separates those strands into independent blocks. Handling related logic together, rather than interleaved, spares the reader from constantly switching between strands. Separating the logic also reduces how much state must be kept and passed through variables: in the first and second programs, the variables i and substr carry the current parsing state from one stretch of the function to the next, forcing the reader to track their contents to understand what each stretch does and whether it is correct. With the logic separated, the conditionals shrink as well. A conditional is often logical patchwork, a remedy for an unexpected case; most such cases are avoidable with suitable arrangement, and separating the logic removes the need for patching. Each block's job also becomes simpler: easier for the reader to understand, and easier for the writer to be sure it is correct. In the example below, the code first checks whether a newline is in the string, ruling that case out; then it finds the positions of the spaces; finally it copies the strings into method, uri, and proto.
/* strncspn() and strndup() are the same as in the previous program */

int http_req_parse(http_req *req, const char *buf, int sz) {
    const char *substr, *last;
    int i;
    const char *fss[4];

    sz = strncspn(buf, sz, "\r\n");
    last = buf + sz;

    substr = buf;
    for(i = 1; i < 4; i++) {
        fss[i] = substr + strncspn(substr, last - substr, " ");
        if(fss[i] == last) break;
        substr = fss[i] + 1;
    }
    if(i != 3)
        return -1;

    fss[0] = buf;
    fss[3] = last;
    for(i = 0; i < 3; i++) {
        if(i > 0) fss[i]++;
        if((fss[i + 1] - fss[i]) < 1)
            return -1;
    }

    req->method = strndup(fss[0], fss[1] - fss[0]);
    req->uri = strndup(fss[1], fss[2] - fss[1]);
    req->proto = strndup(fss[2], fss[3] - fss[2]);

    return 0;
}

/* main() is identical to program 1 */
Simplifying once more
The previous program separated the logic; this one improves the separated pieces further. Highly reusable parts are extracted again: strnchrs() is really a variant of strchr(). Such single-purpose functions are highly reusable, and giving them meaningful names adds readability. Several loops are also replaced by direct statements; direct statements usually read better than loops and conditions, though with many repetitions a loop is of course the sensible choice.
/* strncspn() and strndup() are the same as in the previous program */

int strnchrs(const char *s, int max, int c, const char *chrs[], int chrs_max) {
    int i, j = 0;

    for(i = 0; i < chrs_max; i++, j++) {
        for(; j < max; j++)
            if(s[j] == c) break;
        if(j == max) break;
        chrs[i] = s + j;
    }

    return i;
}

int http_req_parse(http_req *req, const char *buf, int sz) {
    int i;
    const char *last;
    const char *fss[3], *starts[3];

    sz = strncspn(buf, sz, "\r\n");
    last = buf + sz;

    if(strnchrs(buf, sz, ' ', fss, 3) != 2)
        return -1;

    starts[0] = buf;
    starts[1] = fss[0] + 1;
    starts[2] = fss[1] + 1;
    fss[2] = last;

    for(i = 0; i < 3; i++)
        if(starts[i] == fss[i]) return -1;

    req->method = strndup(starts[0], fss[0] - starts[0]);
    req->uri = strndup(starts[1], fss[1] - starts[1]);
    req->proto = strndup(starts[2], fss[2] - starts[2]);

    return 0;
}

/* main() is identical to program 1 */
Line count
On the surface the line count balloons once readability is improved. Count only the meaningful statements, though, and it actually drops. The apparent growth comes from function and variable declarations, calls, and blank lines. Should those be charged to the program's complexity? I say no. My counting method: every assignment statement counts as one line, and every for, if, break, return, and continue counts as one line. Thus
if(...) break;
counts as two lines, while variable and function declarations are not counted at all.
Counted this way, the last program comes out several lines shorter than the first.
Executable size
-rwxr-xr-x 1 thinker users 6117 May 6 00:06 readability0
-rwxr-xr-x 1 thinker users 6161 May 5 20:58 readability1
-rwxr-xr-x 1 thinker users 6064 May 5 21:30 readability2
-rwxr-xr-x 1 thinker users 6125 May 6 00:06 readability3
These are the sizes of the four programs, in order. The fourth program costs a mere handful of bytes over the first, yet reads far better. Bear in mind, too, that this program is small; large programs repeat themselves more, so improving readability may even make them smaller than before. The bigger the program, the bigger the payoff.
Conclusion
A program's readability comes from meaningful names and well-separated logic. Readability does not inflate a program; on the contrary it can make it leaner, smaller, even faster. Module design and information hiding were not discussed here; this article dealt with the readability of program flow alone.
Thursday, October 29, 2009
Programming Notes (程式設計注意事項) - 2008.08.17 update
by thinker
Keywords: coding
A few points to keep in mind; updated from time to time.
These are advisory principles only, a list of recurring problems and their remedies from day-to-day work. I once saw someone mention this memo on their blog, saying they could not see the reasoning behind the rules. The rule statements are kept terse, so not everyone will get every one. Perhaps, like the Tuibeitu prophecies, each rule will suddenly make sense when the situation arrives :p Though of course this memo is nothing so profound or vast.
* Change and its impact
o Classes and modules must talk to each other through interfaces.
o Depending on another class's or module's internal structure leaves you exposed to its structural changes.
o An interface adapts to changes of the implementation; the interface itself stays unchanged.
o Concentrate and confine inter-class and inter-module dependencies to a small area; that keeps change cheap.
o Through such adaptation, an interface concentrates and confines the area a change can affect.
o Settle the relations among classes, modules, and interfaces before writing code. 2007.02.23
o Keep the relationship diagram within reach. 2007.02.23
o Avoid or reduce indirect exposure of interfaces. 2007.02.23
* Closures instead of function objects
o Using a closure beats a function object.
o A closure means less code.
o A closure keeps the data a callback needs within a small region of source code (static scope).
o Closures improve readability.
* Partitioning functionality
o Each class maintains the data needed for one single goal.
o Each function or method serves a single purpose.
o Split the related actions needed to reach a goal into separate functions, methods, or classes.
+ After data is inserted into a database the index must be updated: the insert action calls the index-update function or method instead of doing everything by itself.
o If a class offers many callback methods, consider organizing the callbacks into one or more interfaces, giving a top-down, simpler topology. (update: 2006.11.28)
* Naming rules
o A function name must contain a verb.
o Function parameters are nouns.
o Parameters appear in the order they occur in the function name.
o A noun may be omitted from the name, but the parameters should keep their original order.
o The object a method is bound to is the subject.
* Syntax usage
o One action per line; several actions on one line hinder debugging.
o One function call per line; several calls on one line are hard to debug (which function failed?).
o Limit the nesting depth of brackets of every kind: braces, square brackets, parentheses.
+ When brackets nest too deep, split the intermediate computation into several lines for readability (see the sketch right after this list).
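As an illustration of the syntax rules above (my own sketch; every name in it is hypothetical):
/* one line, three calls, two operators: hard to step through,
   and a failure could be in any of the three functions */
total = price(item) * count(cart) + shipping(address_of(cart));

/* one action per line: every intermediate value can be inspected */
unit = price(item);
n = count(cart);
fee = shipping(address_of(cart));
total = unit * n + fee;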
The Art of Simplicity (簡單的藝術)
by thinker
Keywords: coding, software engineers
People generally call us a bunch of software engineers. An engineer's job is to reach the goal within the available time and budget, by whatever means. But is it really just a matter of reaching the goal? Quality matters enormously too, and software quality covers correctness, execution efficiency, and maintenance cost. To improve software quality, a software engineer must be an artist. Art conveys one's thoughts and ideas completely, by "ingenious" means. What is "ingenious"? A method that makes people marvel. Deflecting a thousand pounds with four ounces is ingenious; moving a tree with an elephant is not. The patient "rhinoceros gazing at the moon" beats "standing here waiting forever". Ingenious methods let us convey our intent simply and correctly. Writing programs is no different.
If a programmer is merely an engineer, he can get the functionality done but not with good quality. Good quality means finishing the work ingeniously. An ingenious method is not one nobody else can fathom, nor a crude straight-through hack. "Ingenious" means producing a short, correct result in a way that is easy to understand. Stray from either "easy to understand" or "short and correct", and it is no longer ingenious.
For example, in base64 encoding, a lookup table is more ingenious than an if-else chain:
static const char Base64[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static const char Pad64 = '=';
......... skip ............
while (2 < srclength) {
input[0] = *src++;
input[1] = *src++;
input[2] = *src++;
srclength -= 3;
output[0] = input[0] >> 2;
output[1] = ((input[0] & 0x03) << 4) + (input[1] >> 4);
output[2] = ((input[1] & 0x0f) << 2) + (input[2] >> 6);
output[3] = input[2] & 0x3f;
target[datalength++] = Base64[output[0]];
target[datalength++] = Base64[output[1]];
target[datalength++] = Base64[output[2]];
target[datalength++] = Base64[output[3]];
}
.................. skip ...........
Replacing the table lookup with if-else conditions would make the program more complex and harder to understand. The lookup table may look dumb, spelling out every letter, but a reader gets it at a glance, it is easy to understand, and it costs nothing in execution efficiency. That is "ingenious".
Changing:
output[0] = input[0] >> 2;
...... cut ....
target[datalength++] = Base64[output[0]];
...... cut ....
into:
target[datalength++] = Base64[input[0] >> 2];
is not more ingenious; on the contrary, each line now packs more meaning and the program is less easy to understand. The original may take more lines, but for the reader it is much simpler! You may feel perfectly used to the second style, but consider that not everyone is. Keep It Simple and Stupid (KISS) means exactly this: write your code in the simplest way you can.
Simplicity means whatever the reader finds easy to understand. In this OO era many people pin great hopes on inheritance, believing that inheriting as much as possible shrinks the code and eases maintenance. So many people inherit from anything that shares any common part, no matter what it is. But inheritance must follow a purpose and a concept; without a coherent purpose there is no thread to follow, and the whole inheritance hierarchy becomes impossible to understand. Inheritance, too, must be ingenious: ingenious inheritance classifies objects by a consistent rationale, not by how much code they happen to share. Inheriting by degree of code duplication leaves the hierarchy patternless, so readers cannot understand it. Nor should inheritance go too deep: to grasp one object's behavior, the reader must chase every ancestor class up the inheritance tree. Deep inheritance burdens the reader and makes the program hard to maintain; in such cases even the designer, coming back after a while, will be dizzied by his own design!
Indentation matters to a program just as much; without it, most programs are beyond understanding. An ingenious program needs ingenious indentation, and I recommend at least four spaces per level. Some people do not indent at all; some are stingy and indent only one or two spaces. Indentation helps the reader understand, and helps you debug and avoid mistakes. A friend of mine indented everything with a single space, saying that was how to squeeze the most into limited screen space. I said that invites mistakes; he said no. Asking further, I found he did in fact often misread which line was which. So: first, do not write overly long lines; second, do not nest too deeply; third, four or more spaces per level is harder to misread.
Comments, too, must be ingenious. Superfluous explanation only troubles the reader. In truth, once we start reading a program's body we attend to the code itself and skip the comments without noticing them. Excess explanation merely interrupts readers, who want to see as much material as possible on one screen; code arranged that way reads more smoothly and is easier to understand. So keep comments short and simple, and do not let the beggar drive out the temple keeper, breaking the reader's rhythm.
Beyond the code itself, the design of the architecture (how is that best written in Holo, by the way?) must be ingenious too. An ingenious design solves a complex problem with a simple architecture, and simple designs are easy to understand. Designing something complicated is not hard; finding the simple design that solves the complex problem is what "ingenious" means. If client-server solves the problem, there is no need for peer-to-peer; if a hash table suffices, there is no need for an rb-tree. In everything, take simplicity as the ideal.
Keep everything simple: keeping things simple is itself an art. What an artist pursues is precisely the ingenious conveyance of intent, and only with an artist's spirit can one reach the highest level of the ingenious. Which is to say, a good programmer must be an artist, conveying your intent to whoever reads the program through the simple, ingenious art of simplicity.
Automating debugging with GDB (GDB 自動化 debug)
by thinker
Keywords: coding
Lately many people have made a habit of writing unit tests to ensure program correctness, and I am a devoted unittest user too. But unittest targets modules in isolation; for some integration-level questions, ordinary unittest tools would force you to plant code in the program that is never used at run time, just to record things. Suppose, for instance, you want to know whether the running program calls a module according to some rule or protocol. The usual approach plants code at the module's entry points to check that the call order and flow follow the protocol. Such code tends to be messy: there may be many functions to check, so the logic gets scattered. Wouldn't it be lovely if that logic could be gathered in one place? This is where GDB plus scripting helps.
For example, while hacking on MadButterfly I wanted to know whether moving an object on screen redraws only the graphics around it rather than the whole screen. With gdb I can plant a few breakpoints, attach scripting commands, and record how many related graphics are redrawn each time the object moves. So at the gdb prompt I typed:
set pagination off
break object_move
commands
silent
set $draw_cnt=0
continue
end
break object_draw
commands
silent
set $draw_cnt = $draw_cnt + 1
continue
end
break where_redraw_completed
commands
silent
print $draw_cnt
continue
end
run
The commands directive attaches to the breakpoint set by the preceding break. When that breakpoint fires, everything between commands and end runs exactly as if typed at the prompt, so these commands execute automatically on every trigger, including the final continue. Because the last command is continue, the program does not stop after the commands run but keeps going. Each command list also begins with silent, which keeps gdb from printing its usual status messages when the breakpoint triggers. And set pagination off stops gdb from paginating printed output the way more does; without it, gdb would pause the program waiting for a keypress between pages. Thus the program runs on and on while we watch the number of updates.
The same technique also works for running integration test cases. gdb's command language includes if-else, while, and the like (see ), and the commands can be written into a text file and executed as a script. I hope the trick helps your development work too.
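As a small illustration of that (my addition; object_draw and the reporting threshold are made up), the breakpoint commands can live in a text file together with flow control:
set pagination off
set $draw_cnt = 0
break object_draw
commands
silent
set $draw_cnt = $draw_cnt + 1
if $draw_cnt % 100 == 0
  print $draw_cnt
end
continue
end
run
Saved as count.gdb, it is replayed with gdb -x count.gdb ./your_app, the same -x mechanism used in the next article.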
Tracking memory corruption with GDB (GDB 追蹤 memory corruption)
by thinker
Keywords: coding
Not long ago, in the article on automating debugging with GDB, I briefly showed how to drive gdb with a script. Today, thanks to a bug in one of MadButterfly's example programs, the same trick came out again. This time it was a classic kind of bug: some block of memory gets modified at an unknown moment, and the program breaks.
Let me simplify the problem. A variable V is initialized to a specific value k right after the program starts. For some unknown reason, V later ends up changed to m. To find out who changed V, the usual approach is to review the program flow and find every place that might touch V. But sometimes a stray pointer clobbers the variable by pure accident, and then inspection by eye goes nowhere. What matters most then is finding out at which step of execution the variable changes.
The common tactic is printf() everywhere, then compile and rerun, over and over. That is inefficient, because you cannot examine other variables on the fly for comparison. A better way is to have gdb stop automatically the moment the variable is modified; setting a watchpoint in gdb does exactly that. But some platforms' hardware does not support watchpoints, and gdb then emulates them by single-stepping the whole program, which is extremely slow; for a complex program it is practically hopeless. In that case, well-placed breakpoints close in on the trouble spot far faster.
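For reference, the watchpoint variant mentioned above is simply (a minimal sketch, using the V of the simplified problem):
(gdb) watch V
(gdb) run
With hardware support, gdb stops the instant V changes and reports the old and new values; the breakpoint technique below is the fallback when watchpoints are unsupported or too slow.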
The approach: set breakpoints at the suspicious spots along the execution path, like this
break xxx.c:123 if V != k
This breakpoint sits at xxx.c:123 and checks whether variable V still equals k. If not, the program stops; otherwise the breakpoint has no effect. You can scatter such breakpoints around, fence the problem into a region, and keep shrinking the region until you find the exact spot. The advantage: setup is fast, reruns are fast, and you can examine the process state to decide where the next breakpoint should go.
Sometimes the memory you want to watch is allocated dynamically, so its address is known only at allocation time. Then set a breakpoint on the allocating statement and record the address:
break xxx.c:567
commands
silent
set $addr=p
continue
end
Suppose the statement just before xxx.c:567 stores the newly allocated address in the local variable p. The setup above then saves that address in gdb's convenience variable $addr, and other breakpoints can inspect the memory at that address. For example
break xxx.c:123 if *$addr != k
checks the contents of the memory $addr points to whenever execution reaches xxx.c:123.
These settings can be written into a text file, say foo.txt
break xxx.c:567
commands
.......
end
and then replayed on every run through a gdb argument:
gdb -x foo.txt your_program
Dalvik Code Analysis and Demonstration (2) (Dalvik 程式碼分析與示範(二))
by thinker
Keywords: Android
As software goes, Dalvik is a small thing, yet it still cannot be covered in a moment. The previous article ended with Dalvik building gDvm, which completes initialization; bytecode execution can begin. Dalvik's functional partitioning is fairly clear: vm/Jni.c provides the user-facing interface through JavaVM and JNIEnv, while keeping internal details hidden from outside view. But remember our original goal: we want to understand how the VM works, and so far we have only surveyed the exterior. The VM's internals are as countless as the stars of the galaxy; we must pick one target and focus the analysis there, or lose our way among the code. For VM analysis, what we most want to know, and find most interesting, is of course bytecode execution. So we head straight for the bytecode interpreter.
Interpreter
gNativeInterface lists a pile of functions of the form CallXXXMethod(); they are surely the express route to the interpreter. And main() calls (*env)->CallStaticVoidMethod() to execute the class's main, so going straight for CallStaticVoidMethod() is entirely reasonable. Yet its definition is nowhere to be found. That usually means one of two things: the function is defined through a macro, or it is generated by some tool at build time. Either way the full name cannot be located directly; searching for part of the string will turn up the definition site.
Noting that CallXXXXMethod() comes in several flavors of XXXX, there is presumably a macro that takes XXXX as a parameter and generates the family. I grepped for StaticVoid and found nothing; trying Void as the keyword finally revealed a few places in Jni.c that invoke a macro with Void as an argument. Thus
CALL_STATIC(void, Void, , , false);
should be where CallStaticVoidMethod() gets defined. Checking further,
#define CALL_STATIC(_ctype, _jname, _retfail, _retok, _isref)            \
    static _ctype CallStatic##_jname##Method(JNIEnv* env, jclass clazz,  \
        jmethodID methodID, ...)                                         \
    {                                                                    \
        JNI_ENTER();                                                     \
        JValue result;                                                   \
        va_list args;                                                    \
        assert((ClassObject*) clazz == ((Method*)methodID)->clazz);      \
        va_start(args, methodID);                                        \
        dvmCallMethodV(_self, (Method*) methodID, NULL, &result, args);  \
        va_end(args);                                                    \
        if (_isref)                                                      \
            result.l = addLocalReference(result.l);                      \
        JNI_EXIT();                                                      \
        return _retok;                                                   \
    }                                                                    \
    ....
and indeed these functions are defined there. This is a trick C programmers use constantly: define a family of functions with a macro, feeding a parameter into each function's name. The '##' operator separates the tokens and, after expansion, pastes the pieces back together; in the example above, when _jname is Void, the function header line produces the name CallStaticVoidMethod.
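A self-contained miniature of the same trick (my own example, not Dalvik code):
#include <stdio.h>

/* DEFINE_GETTER(Answer, 42) expands to:
   static int getAnswer(void) { return 42; } */
#define DEFINE_GETTER(_name, _value) \
    static int get##_name(void) { return _value; }

DEFINE_GETTER(Answer, 42)
DEFINE_GETTER(Zero, 0)

int main(void) {
    printf("%d %d\n", getAnswer(), getZero());
    return 0;
}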
From the name alone we can guess that dvmCallMethodV() is closely tied to executing bytecode. Pressing alt+. in Emacs again jumps straight to the first line of dvmCallMethodV(), in vm/interp/Stack.c; so this function has to do with the stack. A quick scan of the function turns up several eye-catching names:
* dvmIsNativeMethod()
o (*method->nativeFunc)(...)
o dvmInterpret()
* dvmPopFrame()
dvmIsNativeMethod() plainly tests whether the method is a native (JNI) method: if so, the method's nativeFunc() is invoked; if not, dvmInterpret() is called directly. Here our eyes light up; the keyword "Interpret" glows with promise. dvmIsNativeMethod() also signals a state test: "Is" is a conventional keyword, as in IsBusy(), IsClear(), ... IsXXX(), for macros and functions that test object or system state. Reading other people's code is, in fact, essential training. Many programmers never read others' code, but when everyone cultivates private habits behind closed doors the results diverge hopelessly; how could anyone understand anyone? Read plenty of code and, by osmosis, you will write more readable code yourself.
dvmPopFrame() hints that somewhere earlier a frame must be created. A frame is the space reserved on the stack for each function call, storing the function's state; that should need no explanation here, and if it does, go search. If the search results still baffle you, more reading is in order. Back to the point: scanning back up the function, near the top sits the line
clazz = callPrep(self, method, obj, false);
call + Prep suggests preparing a frame. Stepping inside, the comment atop callPrep() says plainly:
* Pushes a call frame on, advancing self->curFrame.
So the guess was right. But frames do not interest us for now; back to dvmCallMethodV().
Next we read dvmInterpret() itself, in vm/interp/Interp.c. A few shining names stand out:
* interpState
* stdInterp
o dvmMterpStd
o dvmInterpretStd
InterpState looks like the data prepared for interpretation; it carries
* method
* fp
* pc
* entryPoint
each apparently part of the interpreter's execution state. Further down, stdInterp is chosen as either dvmMterpStd or dvmInterpretStd depending on gDvm's contents, and later comes
change = (*stdInterp)(self, &interpState);
so stdInterp must be the interpreter's entry point, with two interpreters to choose from. There is also a debug flavor, dvmInterpretDbg(), but debugging is not our interest now; skip it.
Looking up where dvmMterpStd() and dvmInterpretStd() live, both are under vm/mterp/. ctags cannot find dvmInterpretStd(), so try grep, which yields
./mterp/out/InterpC-portstd.c:#define INTERP_FUNC_NAME dvmInterpretStd
./mterp/portable/portstd.c:#define INTERP_FUNC_NAME dvmInterpretStd
Two files under vm/mterp/ alias the symbol via a macro; presumably somewhere else the function body is defined in terms of INTERP_FUNC_NAME.
Word has it that Dalvik carries assembly optimizations for particular platforms; the two interpreters, dvmInterpretStd and dvmMterpStd, are presumably related to that. A glance into vm/mterp shows
* armv4
* armv5te
* x86
* ...
and other platform-flavored names, so the rumor appears true.
Tracing code leans heavily on intuition. Intuition comes in degrees, and the difference is accumulated everyday knowledge. Keep your associations nimble while tracing, and distinctive names will connect with what is already in your head, sparking intuition. That knowledge does not come only from books; hearsay can become useful knowledge too, like the rumor above about per-CPU optimization.
To be continued......
Dalvik Code Analysis and Demonstration (1) (Dalvik 程式碼分析與示範(一))
by thinker
Keywords: Android
Android is all the rage lately; Google's big move has shaken the whole industry, though compared with the waves MS or Apple stirred up in the past there is still some distance. For a technology researcher, newsworthiness hardly matters; what we want to know is what is really being sold inside the gourd. I was recently invited into a certain group to pick a research topic, hoping to improve the local open-source climate and attitude. So I set about analyzing the Dalvik code.
What Dalvik is made of
Dalvik is a VM (Virtual Machine), the counterpart of Java's JVM, .Net's CLI, and the interpreters of Python, Perl, and Ruby. Dalvik defines its own bytecode as the VM's instructions, the analog of a CPU's machine code; the VM interprets and executes these bytecode instructions. Android uses the Dalvik VM to achieve platform independence. The Dalvik source tree holds far more than the VM: there is also a core library, including Java source code and the native method code implementing JNI. But those are libraries, not the VM proper, and lie outside this article's scope. This article analyzes and discusses the VM proper, and demonstrates code-tracing techniques along the way; I hope aspiring learners can profit a little from it.
The Dalvik directory
Under the Dalvik directory you will see twenty-odd files and directories: documentation, Makefiles, libraries, tools, and so on. Only the following two concern the VM:
* dalvikvm/
* vm/
dalvikvm/ is really just the VM's main function; it calls functions defined under vm/ to initialize and start the VM and run the given program. The real body of the VM is under vm/. Anyone studying the Dalvik code should concentrate on these two directories and not be distracted by code elsewhere. That said, the documents under docs/ are well worth reading as preparation. Whenever you hit an unfamiliar term during the study, look it up with a search engine.
Starting the analysis
The first step in analyzing any system is finding the program's entry point. For a C application that is the main() function, here in dalvikvm/Main.c. Then the analysis begins. But main() immediately threatens to sweep you away like a rolling river: nine times out of ten you start at the first line, follow every call down, refuse to let go of any twig, and end up lost in a boundless sea of code until your will is spent and you surrender. That is the harm of seeing trees and missing the forest, the first thing to guard against. So keep the original goal in mind: the VM itself, not the surrounding initialization. Survey main() first, lock onto the suspicious main body, then analyze, understanding the broad architecture first, so as to grasp the outline and not lose your way.
From main() we can more or less instinctively guess that the lines
* JNI_CreateJavaVM()
* FindClass()
* GetStaticMethodID()
* CallStaticVoidMethod()
matter most to the VM proper. Analyzing these functions should reveal the code's architecture:
* the main functions
* how functions map onto .c files
* which part of the functionality each .c file implements
* what each subdirectory is responsible for
At first the goal is to understand the points above and get familiar with the file and directory layout, not to plunge headfirst into the details.
From those few lines we can almost guess Dalvik's flow: first create and initialize a VM; then load and obtain the class named on the command line via FindClass(); fetch the class's static method via GetStaticMethodID(); finally execute that method. We know a Java program runs by naming a class, and the VM assumes the class has a static method named main and executes it. So we quickly see that Dalvik's main flow is roughly the above.
JNI_CreateJavaVM()
JNI_CreateJavaVM(), as the name suggests, creates and initializes a Dalvik VM to run the given code. Know first that Dalvik is essentially a Java VM, so many terms carry straight over from Java; JNI is the Java Native Interface, the boundary between Java and native code. The JNI here presumably relates to that: either functions offered for native code to call, or functions for calling into native code. Since this one creates the VM, we can assume it is offered to native code.
Next we chase into JNI_CreateJavaVM() to see what medicine the gourd holds. Here, make good use of ctags with your editor; otherwise you must grep the whole tree to find JNI_CreateJavaVM(). In emacs, put the cursor on the function and press meta+. (alt+.), which makes emacs consult the ctags database, open the file defining the function, and jump to its head. For example, pressing alt+. on JNI_CreateJavaVM() jumps to its first line. JNI_CreateJavaVM() is in vm/Jni.c, so Jni.c should be the VM's main interface.
To confirm vm/Jni.c's main role, it is best to go through its functions once. Scanning vm/, you might expect a Jni.h. Most programmers share habits like this, for example exporting the functions defined in Jni.c to other modules through Jni.h. It is a common convention; if a programmer's habits stray too far from the common ones, or are wildly inconsistent, the code's quality is bound to be poor and not worth studying. When studying someone else's code, exploit the common conventions: used well they are like divine aid; ignored, you are an ox plowing a field. A typical module defines and exports its main functionality as a set of functions and data structures, while the source is full of implementation detail; checking one by one which functions are exported interface and which are not would eat most of your time. The best approach is to read the header file, the .h file, which usually contains only the exported part and little noise. Reading a module's header tells you roughly what the module does.
However, there is no Jni.h, so we must inspect Jni.c directly. Luckily the Dalvik developers have the good habit of declaring locally used functions static. This common good habit lets us skip the static functions quickly and see which functions matter. To go even faster, use the editor's folding feature to collapse the bodies and show only the top-level function definitions. In emacs, ctrl+c @ ctrl+alt+h does that; ctrl+c @ ctrl+alt+s shows all the detail again.
In Jni.c we see
* dvmJniStartup()
* dvmJniShutdown()
* dvmGetJNIEnvForThread()
* dvmCreateJNIEnv()
* dvmDestroyJNIEnv()
* dvmGetJNIRefType()
* dvmReleaseJniMonitors()
* dvmLateEnableCheckedJni()
* JNI_GetDefaultJavaVMInitArgs()
* JNI_GetCreatedJavaVMs()
* JNI_CreateJavaVM()
Dozens of static functions get skipped; only those few global functions deserve attention. A module's interface defines many features, only some of which are important; the rest are auxiliary. Among them,
* dvmJniStartup()
* dvmJniShutdown()
look like the initialization functions a user of this module must call. Grepping the tree finds them called from dvmStartup() (in vm/Init.c). dvmStartup() calls many other *Startup() functions as well; apparently every subsystem's initialization is invoked there. dvmStartup() in turn is called by JNI_CreateJavaVM(). So we can surmise that every sub-module's initialization, Jni.c's own included, runs through JNI_CreateJavaVM() calling dvmStartup().
gDvm
In main() we can see that vm and env are two uninitialized local pointer variables. main() passes their references (addresses) into JNI_CreateJavaVM(), which means JNI_CreateJavaVM() must fill in the two pointers' contents. As a rule, a developer does not pass a reference to an uninitialized variable into a function unless that function is responsible for setting it. From the names alone we can surmise that vm will hold the address of the VM object JNI_CreateJavaVM() creates; and since main() calls CallStaticVoidMethod(), FindClass(), and the rest through env, env is presumably the control interface the VM provides.
To confirm the guess, we must examine JNI_CreateJavaVM() itself. pVM, which becomes main()'s vm, has its memory allocated and filled in right in this function; pEnv, which becomes main()'s env, is obtained from dvmCreateJNIEnv(). Inspecting dvmCreateJNIEnv() and JNI_CreateJavaVM() shows no data link between pVM and pEnv: neither points at the other, so they appear unrelated. Yet main() hints that they are related. And inside JNI_CreateJavaVM() we find the global variable gDvm, apparently initialized by JNI_CreateJavaVM(). Global variables usually mean that various functions share data through them. Could the link between pVM and pEnv be established through gDvm?
pVM and pEnv are assigned to main()'s vm and env only through casts:
*p_env = (JNIEnv*) pEnv;
*p_vm = (JavaVM*) pVM;
This usually means the first field of pEnv has type JNIEnv and the first field of pVM has type JavaVM. It is a trick C programmers use often to hide implementation data: put what must be exported in the first field, export it with a cast, and keep users from seeing the rest. By now the reader may have noticed that tracing someone else's code keeps requiring knowledge of the author's habits and tricks. Fortunately, habits that work get adopted by most people; hence, to trace others' code you need solid coding skills yourself, and reading others' code builds up your stock of the common tricks and habits. Some years ago there was a book called Code Reading (by Diomidis Spinellis); it really ought to cover this material.
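A tiny sketch of that first-field trick (my own illustration, not the Dalvik source):
#include <stdio.h>

typedef struct {              /* the exported view */
    int (*answer)(void);
} PublicIface;

typedef struct {
    PublicIface pub;          /* must be the first field */
    int secret;               /* hidden implementation state */
} PrivateImpl;

static int the_answer(void) { return 42; }

int main(void) {
    PrivateImpl impl = { { the_answer }, 7 };
    /* a pointer to a struct also points to its first member,
       so this cast hides everything after pub */
    PublicIface *user_view = (PublicIface *)&impl;
    printf("%d\n", user_view->answer());
    return 0;
}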
To confirm the first fields of pVM and pEnv, we examine JavaVMExt and JNIEnvExt in JniInternal.h and find them typed JNIInvokeInterface* and JNINativeInterface* respectively. What is going on? Back to JavaVM and JNIEnv, where we find
#if defined(__cplusplus)
typedef _JNIEnv JNIEnv;
typedef _JavaVM JavaVM;
#else
typedef const struct JNINativeInterface* JNIEnv;
typedef const struct JNIInvokeInterface* JavaVM;
#endif
We are plainly using C, not C++, so the lower half applies; and indeed it matches what we found. So the first fields of pVM and pEnv export exactly what main()'s vm and env provide.
Examining JNI_CreateJavaVM() and dvmCreateJNIEnv() more carefully, those first fields get their contents from gInvokeInterface and gNativeInterface, each a long table of function pointers. That fully confirms the earlier guesses.
By now the reader will have noticed that I do not trace code line by line. I guess boldly, verify carefully, guess again, verify again, leaping through the source. Consider that programmers do not write linearly either, from first line to last in one breath; they implement one concept at a time, concept after concept, gradually filling the code in, and a function may be reworked many times over.
The functions listed in gInvokeInterface and gNativeInterface all seem to be static functions inside Jni.c. Searching Jni.c for gDvm, to see whether and how these functions touch it, shows that few do. Looking again at the definitions of JavaVMExt and JNIEnvExt, they store hardly any state; instead gDvm's type, DvmGlobals, stores a great deal, so most of the VM's state apparently lives there. When a system keeps its state in global variables, only one instance can run per process; so Dalvik currently allows just one VM per process. In my own programs I strenuously avoid this, to keep open the option of running several identical instances in one process. Still, globals are a handy way to be lazy, if you are quite sure you will never want that flexibility.
Writing this far, I realize this piece is not one article but several. So: to be continued.....
by thinker
2 Columns
關鍵字:
Android
近來 Android 十分熱門, Google 的大動作,撼動整個業界。雖已震天撼地,和過去 MS 或 Apple 所興之波瀾相較,還是有些差距。身為一個技術研究者,新聞性似乎不是這麼重要,倒底葫蘆裡賣的是什麼藥,才是吾輩所想知道。小弟最近獲邀加入某團體,而擇主題研究,企圖改善國內 Open Source 的風氣和態度。於是著手分析 Dalvik 程式碼。
Dalvik 的成分
Dalvik 是一個 VM (Virtual Machine) ,相當於 Java 的 JVM 、 .Net 的 CLI 和 Python 、 Perl 、 Ruby 的 Interpreter 。Dalvik 定義自己的 bytecode ,為 VM 的指令,相當於 CPU 的機械碼。這些 bytecode 指令由 VM 進行解釋、執行。 Android 利用 Dalvik VM ,達到跨平台的目的。Dalvik 的程式碼目錄裡,又豈只 VM 而已,尚有一核心 library ,包括 Java Source Code 和 implement JNI 的 native method code。然,前述皆是 library ,非 VM 本體,不在本文討論之列。本文就 VM 本體進行分析和討論,並示範 trace code 的方法,希望有志學習者能因而獲益一、二。
Dalvik 目錄
Dalvik 目錄下,會看到二十來個檔案和目錄,有文件、Makefile 、library 、工具等。和 VM 有關者,唯
* dalvikvm/
* vm/
二者。dalvikvm/ 目錄實為 VM 的 main function ,透過呼叫 vm/ 目錄下所定義的 function ,初始並啟動 VM ,以執行指定程式。而 VM 的真正本體則在 vm/ 目錄之下。概欲研究 Dalvik 程式碼者,因集中於此二目錄,勿被其它目錄內的程式碼所迷惑。然而, docs/ 目錄下的文件,有許多有參考價值。有志者,或可一讀,為預習功課。研究過程中,遇不解名詞時,請隨時透過 search engine 進行查詢。
開始分析
分析任何系統之初,首要為找到程式的進入點。以 C 的 application 而言,就是 main() function ,在 dalvikvm/Main.c 裡。接著著手進行分析。然而,main() function 一開始就有滾滾江水、源源不絕之勢,十之八九,可能從第一行開始看,順著每一個呼叫追蹤而入,覓技微末節而不棄,直至迷途於茫茫程式碼之間,終至意志損耗殆盡而投降。正是見樹而不見林之害,分析之初唯恐避之不急。因此,謹記研究初衷為 VM 本身,而非相關初始設定。需先觀測 main() function ,再鎖定可疑的主體,再進行分析,以了解大架構為先,以達提綱挈領之效,不至於迷罔。
從 main() function 裡,我們大概能本能性的猜測到
* JNI_CreateJavaVM()
* FindClass()
* GetStaticMethodID()
* CallStaticVoidMethod()
這幾行,應該和 VM 本體關係較大。透過分析這幾個 function ,應該能了解程式碼的架構
* 主要 function
* function 和 .c file 間的關係
* 每個 .c file 主要 implement 哪一部分的功能
* 每個子目錄又個別負責什麼功能。
起初應以瞭解上面所提之事,熟悉程式碼的檔案和目錄結構而主,不可一頭栽入程式細節之中。
從上面幾行,幾可猜出, Dalvik 的執行,必需先建立、初始化一個 VM,然後透過 FindClass() 載入、並取得命令列指定的 class ,並透過 GetStaticMethodID() 取得 class 裡的 static method,最後執行該 method 。我們知道, Java 程式的執行,是透過指定一個 class , VM 假設該 class 有一 static method 名為 main ,以執行之。很快的,我們了解到, Dalvik 主要流程大概是上述的過程。
JNI_CreateJavaVM()
JNI_CreateJavaVM(), as the name suggests, should create and initialize a Dalvik VM for running the specified code. We must first know that Dalvik is very nearly a Java VM, so many terms are carried straight over from Java; JNI, for instance, is the Java Native Interface, the interface between Java and native code. The JNI here presumably relates to that: either functions provided for native code to call, or functions used to call native code. Since this one creates a VM, we can assume it is provided for native code to call.
Next we follow JNI_CreateJavaVM() in to see what it is really selling. Here, use ctags together with your editor; otherwise you will have to grep the whole tree to find where JNI_CreateJavaVM() lives. In emacs, move the cursor onto the function name and press meta+. (alt+.) to make emacs look it up in the ctags database, open the file that defines the function, and jump to its first line. For example, with the cursor on JNI_CreateJavaVM(), pressing alt+. makes emacs jump straight to the first line of JNI_CreateJavaVM(), which turns out to be in vm/Jni.c. Jni.c, then, should be the main interface the VM provides.
To pin down the main role of vm/Jni.c, the best approach is to go through its functions once. Looking at the vm/ directory, you might expect to find a Jni.h. Most programmers share similar habits when coding, for example exporting the functions defined in Jni.c to other modules through Jni.h. This is a common convention; if a programmer's habits stray too far from common practice, or vary wildly, or follow no method at all, the code's quality is bound to be poor and not worth studying. When studying other people's code, exploit the common conventions: use them and it is like divine assistance; ignore them and it is like ploughing a field with an ox. A typical module defines its main functionality as a set of functions and data structures and exports them, while the source files are crowded with implementation details; checking one by one which is exported interface and which is not would eat up most of your time. The best way is to read the header (.h) file directly: a header usually contains only the exported part, so there is little noise. Reading a module's header tells you, by and large, what the module does.
However, we find no Jni.h, so we have no choice but to read Jni.c itself. Fortunately the Dalvik developers have the good habit of declaring locally used functions static. This is a common good habit, and it lets us skim past the static functions quickly and see which important functions remain. To go even faster, use your editor's folding feature to collapse function bodies and show only the outermost definitions. In emacs, ctrl+c @ ctrl+alt+h does exactly that; ctrl+c @ ctrl+alt+s shows all the details again.
In Jni.c we see
* dvmJniStartup()
* dvmJniShutdown()
* dvmGetJNIEnvForThread()
* dvmCreateJNIEnv()
* dvmDestroyJNIEnv()
* dvmGetJNIRefType()
* dvmReleaseJniMonitors()
* dvmLateEnableCheckedJni()
* JNI_GetDefaultJavaVMInitArgs()
* JNI_GetCreatedJavaVMs()
* JNI_CreateJavaVM()
There are dozens of static functions we skip over; only these global functions need attention. A module defines many features in its interface, of which only some are important while the rest are auxiliary. Among them,
* dvmJniStartup()
* dvmJniShutdown()
look like the initialization functions that must be run before using this module. So we grep the tree and find they are called from dvmStartup() (in vm/Init.c). dvmStartup() also calls many other *Startup() functions; it seems the initialization functions of every subsystem are called here. dvmStartup(), in turn, is called by JNI_CreateJavaVM(). We can therefore guess that every sub-module, including Jni.c itself, is initialized through JNI_CreateJavaVM() calling dvmStartup().
gDvm
In main(), we know that vm and env, two local variables, are pointers and are uninitialized. main() passes their references (addresses) to JNI_CreateJavaVM(), which means JNI_CreateJavaVM() must initialize the contents of both pointers. Generally speaking, a developer does not pass references to uninitialized variables into another function unless that function is responsible for setting the variables' contents. From the variable name alone we can already surmise that vm holds the address of the VM object created by JNI_CreateJavaVM(). And since main() calls CallStaticVoidMethod(), FindClass(), and other functions through env, env must be the control interface the VM provides.
To confirm the guess, we must examine JNI_CreateJavaVM(). pVM, which becomes main()'s vm, has its memory allocated and its contents set inside this function, while pEnv, which becomes main()'s env, is obtained by calling dvmCreateJNIEnv(). Examining dvmCreateJNIEnv() and JNI_CreateJavaVM(), we find no data link between pVM and pEnv: neither points at the other, so the two should be unrelated. Yet main() hints that they are related. Then, inside JNI_CreateJavaVM(), we spot the global variable gDvm, which appears to be initialized by JNI_CreateJavaVM(). A global variable usually means that some functions in the system share data through it. Could the link between pVM and pEnv be established through gDvm?
pVM and pEnv are assigned to main()'s vm and env only after a cast:
*p_env = (JNIEnv*) pEnv;
*p_vm = (JavaVM*) pVM;
This usually means that the first field of pEnv is of type JNIEnv and the first field of pVM is of type JavaVM. It is a technique C programmers often use: hide the data the implementation needs, place the data that must be exported in the first field and export it through a cast, and keep users from seeing the rest. By now the reader may have noticed that tracing someone else's code often requires understanding the author's habits and techniques. Fortunately, habits and techniques that work well get accepted and adopted by most people. So to trace other people's code you need a certain level of coding skill yourself, and by reading other people's code you accumulate the techniques and habits that most people use. Some years ago a book appeared called Code Reading (by Diomidis Spinellis); it really ought to include this material.
To confirm what the first fields of pVM and pEnv are, we inspect JavaVMExt and JNIEnvExt in JniInternal.h and find they are of type JNIInvokeInterface* and JNINativeInterface* respectively. What is going on? Going back to check JavaVM and JNIEnv, we find
#if defined(__cplusplus)
typedef _JNIEnv JNIEnv;
typedef _JavaVM JavaVM;
#else
typedef const struct JNINativeInterface* JNIEnv;
typedef const struct JNIInvokeInterface* JavaVM;
#endif
Clearly we are using C, not C++, so the lower half applies. Sure enough, it matches what we found earlier. We have thus confirmed that the first fields of pVM and pEnv export the contents that vm and env provide inside main().
Looking more closely at JNI_CreateJavaVM() and dvmCreateJNIEnv(), we find the contents of those first fields defined in gInvokeInterface and gNativeInterface, each of which is a long list of function pointers aimed at many functions. This fully confirms our earlier guess.
By now the reader will have noticed that when I trace code, I do not read it line by line. Instead I hypothesize boldly, verify carefully, hypothesize again, verify again, tracing the source in leaps. Consider that programmers do not write code linearly either, in one pass from the first line to the last. By and large they implement one concept at a time, concept after concept, gradually filling the code in; a single function may be revised back and forth many times.
The functions listed in gInvokeInterface and gNativeInterface all appear to be static functions inside Jni.c. We search Jni.c for gDvm to see whether these functions access gDvm and how, and find few uses. Looking again at the definitions of JavaVMExt and JNIEnvExt, we find they hold hardly any state. By contrast, gDvm's type, DvmGlobals, holds a great deal of data; most of the VM's state seems to live there. When a system keeps its state in global variables, it means only one instance of the system can run per process. It appears that Dalvik currently allows only one VM per process. In my own programs I try hard to avoid this, so as to keep open the possibility of running several identical instances in a single process. Still, global variables are a convenient shortcut, if you are quite sure you do not want that flexibility.
Having written this far, I realize this is not one article but several. So: to be continued...
Record Your Linux Desktop in an Animated GIF with Byzanz
Wed, 06/10/2009 - 08:25 — hotice — webupd8.blogspot.com
Byzanz is a Linux program that captures your screen not as a video but as an animated GIF image. This is very useful for tutorials or presentations: when describing an action, you will often explain it better with an image than with words. You can use it to capture your whole desktop, a window, or a region you select.
Debugging Tip: Trace the Process and See What It is Doing with strace
strace is a useful diagnostic, instructional, and debugging tool. It can save lots of headache. System administrators, diagnosticians and trouble-shooters will find it invaluable for solving problems with programs for which the source is not readily available since they do not need to be recompiled in order to trace them. This is also useful to submit bug reports to open source developers.
Each line in the trace contains the system call name, followed by its arguments in parentheses and its return value.
Run strace against /bin/foo and capture its output to a text file in output.txt:
$ strace -o output.txt /bin/foo
You can strace the webserver process and see what it is doing. For example, to strace the php5 fastcgi process, enter:
$ strace -p 22254 -s 80 -o /tmp/debug.lighttpd.txt
To see a trace of only the open and read system calls, enter:
$ strace -e trace=open,read -p 22254 -s 80 -o debug.webserver.txt
Where,
* -o filename : Write the trace output to the file filename rather than to screen (stderr).
* -p PID : Attach to the process with the process ID pid and begin tracing. The trace may be terminated at any time by a keyboard interrupt signal (hit CTRL-C). strace will respond by detaching itself from the traced process(es) leaving it (them) to continue running. Multiple -p options can be used to attach to up to 32 processes in addition to command (which is optional if at least one -p option is given).
* -s SIZE : Specify the maximum string size to print (the default is 32).
Refer to strace man page for more information:
$ man strace
Continue reading rest of the How to report / Find a bug under Linux series.
Sunday, October 25, 2009
PHP Debugging, Testing, and Profiling
PHP:
Debugging, Testing, and Profiling
ATLPHP - Atlanta PHP Users Group
January 2007
Who Am I?
Alan Pinstein
Born 1974, programming since 1980
C, C++, Obj-C, PHP, Java, etc...
PalmOS, Apple (Cocoa), Web Applications
PHP since 1998
Owner / Developer @ Showcase Web Sites
Involved in open source community:
PHOCOA - http://phocoa.com
StatViz - http://statviz.sourceforge.net/
Propel - http://propel.phpdb.org/trac/
Overview
Debugging
Finding and Fixing Problems
Logic Testing
Automating Code Testing
Performance Testing
Benchmarking
Profiling
Audience Survey
Who has used these tools?
An IDE? (Zend, PHPEd, Komodo, Eclipse)
Debug extensions? (apd, xdebug, dbg)
error_reporting(E_STRICT)?
Test rigs? (PHPUnit)
Platform?
Windows?
Unix / Linux / Mac?
PHP5?
Debugging: Environment
Text Editor
IDE - Integrated Development Environment
Text editor with code completion, syntax
coloring, documentation, etc.
Visual Debugger
Examples: Zend, Komodo, PHPEd, Eclipse, etc.
Eclipse is free, but tricky
The rest are commercial, but tend to have
trials and cheap personal licenses (<$100)
Debugging: Tools
Built-in PHP functions
View data
Verbose warnings
Debugging Extensions
Stack traces
Command-line debuggers
Examples: apd, xdebug, dbg
PHP Modules
PEAR::Log
PHPUnit3
Debugging: Setup
Debug Extensions
pecl install xdebug-beta
pecl install apd
fix formatting bug:
http://pecl.php.net/bugs/bug.php?id=4569
Need to edit php.ini to activate
PHP Modules
pear channel-discover pear.phpunit.de
pear install phpunit/PHPUnit
Debugging: PHP Functions
error_reporting(E_ALL | E_STRICT);
Display helpful warnings such as use of uninitialized variables
ini_set('display_errors', true);
Print errors to the output rather than just to the web server log
Output Functions:
print
print_r - print data formatted
var_dump - print data formatted (more info)
die - print and stop execution
throw - throw an exception (catchable)
PEAR::Log - write to file, display, etc
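To make these concrete, here is a minimal sketch combining the calls above; the $order array is an invented example value:
<?php
error_reporting(E_ALL | E_STRICT);   // surface notices such as uninitialized variables
ini_set('display_errors', true);     // show errors in the output, not only the server log

$order = array('id' => 42, 'items' => array('book', 'pen'));

print_r($order);                     // human-readable dump of the data
var_dump($order);                    // same data, with types and lengths

if (empty($order['items'])) {
    die('Order has no items');       // print a message and stop execution
}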
Debugging: XDebug
XDebug enhances normal PHP error capabilities
Enable xdebug in php.ini:
zend_extension=/opt/local/lib/php/extensions/no-debug-nonzts-20060613/xdebug.so
Examples:
Stack traces on errors and warnings
Pretty-print version of var_dump
Debugging: ZDE
View source code with syntax coloring
Set breakpoints
See call stack and view data
Testing
Why write tests?
Makes you think about all possible inputs/outputs
Spot new bugs quickly - "regression testing"
Testing: Automation
Unit Test - test a single "subunit" of a program programmatically
Integration Test - test the integration between multiple units
Browser Test - test the way the app performs in a browser
Testing: Writing Unit Tests
Write a unit test for a Calculator class
Test both success and failure
Run test(s) with PHPUnit
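The slides leave the code to the talk itself, so here is one way such a test might look in PHPUnit 3 style; the Calculator class, its add() method, and the exception it throws are assumptions for illustration:
<?php
require_once 'PHPUnit/Framework.php';   // PHPUnit 3 bootstrap
require_once 'Calculator.php';          // hypothetical class under test

class CalculatorTest extends PHPUnit_Framework_TestCase
{
    // Success case: a known input yields the expected output.
    public function testAddReturnsSum()
    {
        $calc = new Calculator();
        $this->assertEquals(3, $calc->add(1, 2));
    }

    // Failure case: invalid input should raise an exception.
    public function testAddRejectsNonNumbers()
    {
        $this->setExpectedException('InvalidArgumentException');
        $calc = new Calculator();
        $calc->add('a', 'b');
    }
}
Running phpunit CalculatorTest then reports both cases.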
Testing with PHPUnit3
PHPUnit3 has comment-based testing!
No need to write any code to perform simple tests
Note: PHPUnit3 and Xdebug are incompatible
Testing: Example
PHPUnit3 can build tests from comments for simple "assert" tests
Or you can write specific test classes
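A sketch of the comment-based style, assuming PHPUnit 3's @assert annotations and its test-skeleton generator; the Calculator class is the same invented example:
<?php
class Calculator
{
    /**
     * @assert (1, 2) == 3
     * @assert (0, 0) == 0
     * @assert (-1, 1) == 0
     */
    public function add($a, $b)
    {
        return $a + $b;
    }
}
// phpunit --skeleton-test Calculator generates a test case
// with one test per @assert line above.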
Testing: Notes
When an error occurs, is the bug in the code or in the test itself?
When to write tests?
Before or after writing the "real" code?
Over time - when new situations arise
Who should write tests?
Ideally, someone besides the unit programmer
Higher level testing with other tools
Selenium - automate browser-based tests
Code Coverage
Profiling
What is Profiling?
Examines the performance characteristics of a program
Records every function call
Tracks execution time of all function calls
Tracks memory usage
Preferred tool: apd
XDebug generates only cachegrind files
Requires kcachegrind (X-windows) to view
apd is simpler to use but just as powerful
Profiling and Benchmarking
Why profile?
Faster / more efficient code
How do you measure speed?
Benchmarking!
How fast does it need to be?
Anticipate your needs; meet your needs
Make it scalable
Beware!
Perfection is the enemy of good enough
Know when to quit
No function taking more than 1-2%
Profiling and Benchmarking:
Example
How can we measure performance?
Single script execution time
only useful on long-execution-time scripts
Time to execute same script multiple times
Benchmarking tools - establish a baseline
ab = Apache Benchmark
ab -n 100 -c 1 http://10.0.1.202:8080/php-talk/profile.php
ab output
Profiling and Benchmarking:
Example
Turn on profiling programmatically with:
apd_set_pprof_trace('.');
Run script once
Evaluate the "dump" with pprofp
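A minimal sketch of that workflow, assuming the apd extension is loaded; doSomeWork() is an invented stand-in for the code being profiled:
<?php
// Write a pprof dump for this request into the current directory.
apd_set_pprof_trace('.');

function doSomeWork()
{
    for ($i = 0; $i < 10000; $i++) {
        md5($i);                    // stand-in workload to show up in the profile
    }
}

doSomeWork();
// Afterwards, inspect the dump from the shell, e.g.:
//   pprofp -z pprofp.1234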
Profiling and Benchmarking:
Example
Look for:
Functions with high % of time
Functions that take a lot of time
Functions called lots of times
real - total elapsed time from start to finish
system - I/O, kernel, etc.
user - YOUR CODE!
real != system + user
Why? Resource contention, I/O bottlenecks
pprofp -z pprofp.1234
Profiling and Benchmarking:
Example
Better, but... % of time in makeWorldPeace() is still high...
Profiling is an art and a science...
Benchmark to confirm
Elapsed time down to 0.01s from 0.19s
Re-profile with performance improvements
ab results: before and after
Resources
Debug Extensions
apd - http://pecl.php.net/package/apd
xdebug - http://xdebug.org
IDEs
Zend
http://www.zend.com
Komodo
http://www.activestate.com/Products/Komodo
PHPEd
http://www.nusphere.com
Resources
Testing
PHPUnit
http://www.phpunit.de/
Great PHPUnit Presentation:
http://sebastian-bergmann.de/talks/2006-11-02-PHPUnit.pdf
Selenium
http://www.openqa.org/selenium-rc/
Benchmarking
ab
http://httpd.apache.org/docs/2.0/programs/ab.html
siege
http://www.joedog.org/JoeDog/Siege
PEAR::Benchmark
http://pear.php.net/package/Benchmark
Q & A
Profiling
Profiling: Objective Analysis
"Profiling" means running your code in a controlled environment to see what actually happens, not what you think happens.
Wednesday, October 21, 2009
Tips on speeding up your Drupal sites
Published Fri, 2007/09/21 - 23:34, Updated Fri, 2007/09/21 - 23:38
We get asked a lot how to make a Drupal site faster. Therefore, we list here some tips on how to achieve such a goal.
These guidelines apply for sites that start to pick up traffic (e.g. more than a few thousand page views a day).
If you have tips to share, please add them as a comment below.
Avoid shared hosting
Shared hosting means that you cannot control, let alone know, the load other web sites impose on the server you are hosted on. Moreover, you cannot tune things such as database variables, nor install additional components (e.g. a PHP accelerator).
So, once your site receives more than a few thousand page views a day, avoid shared hosting altogether.
Go for VPS or dedicated servers
With shared hosting eliminated from the picture, you are basically left with either a VPS (Virtual Private Server) or a dedicated server. A VPS is a reserved slice within a real dedicated server that you control yourself, i.e. you have root access, can install any software you like, etc.
VPS based on Xen are very cost effective, and provide excellent performance, provided you have enough memory.
At what point should you move from a VPS to a dedicated server? It is hard to say, since each site is different, with different modules, different usage patterns, etc. But as a general, imprecise rule, a VPS should be able to handle up to 50,000 to 75,000 page views per day with acceptable performance. Above that, you are better off with a properly configured dedicated server with appropriate memory and CPU.
Install as much memory as you can afford
Memory is cheap compared to the expense involved in hand-tuning and squeezing the most out of a memory-limited machine. The extra memory not only allows extra Apache processes to run so that you can handle traffic spikes gracefully, but it also buffers the filesystem and the database, so disk I/O is avoided altogether and queries are fast.
Avoid dedicated with shared database hosts
Some VPS / dedicated hosts do not give you the option of hosting MySQL on your dedicated server itself, and force you to have the database on a separate server, shared with other databases of other servers! This is a very bad setup for a medium to large Drupal site, since Drupal does a lot of database queries per request. If the database server that is assigned to you serves other busy web sites, your VPS / dedicated server will sit idle waiting for the database requests to be served, and your site will be slow.
Dreamhost is one company that forces the database to be on a remote server. Here is a recent example of how an increasingly popular web site was bogged down because of that architecture.
Use a PHP Accelerator/op-code cache
Using a PHP accelerator (op-code cache) is a must for medium to large web sites. The popular free ones are APC, eAccelerator and XCache. Each has its pros and cons, and depending on the version(s) of PHP you use, some may be more stable than others.
Enable MySQL query cache
The single most important MySQL factor for large sites is a well-tuned MySQL query cache. MySQL stores the results of queries in a cache, so subsequent identical requests are served from it, bypassing the need to re-run the queries.
Make sure you have MySQL's query cache configured with an appropriate size.
Avoid the open buffet syndrome for contributed modules
With so many modules available for Drupal, site admins/builders are tempted to install too many of them, either to try them out, or to have extra cool features offered by said modules.
The down side is that additional modules add to the size of each Apache process, consuming more memory. Moreover, most modules have their own database tables, and cause more queries to be done, adding some overhead.
As well, some modules have untuned non-scalable code, that buckles down under heavy load.
So, avoid using extra modules as much as you can. Failing that, use modules that are proven to be scalable.
Enable Drupal caching
Drupal has a built in cache mechanism. Some of that cache is always on, and cannot be turned off (e.g. menu, filter and variable caches). The page cache is optional and, for anonymous users, collapses many database queries into one query.
So, if your site has a large proportion of anonymous to registered visitors, enable the page cache. For more performance you can also consider the aggressive cache as well.
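As an illustration, the page cache switch can also be flipped from code; a minimal sketch assuming the Drupal 5/6-era variable API and the cache constants used elsewhere on this page:
<?php
// Turn on the optional page cache for anonymous visitors.
variable_set('cache', CACHE_NORMAL);

// Or, if the anonymous share of traffic is very large and the
// trade-offs are acceptable, the aggressive variant:
// variable_set('cache', CACHE_AGGRESSIVE);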
Consider file based caching (boost)
There is a file based caching module called boost that stores the cached pages in flat files outside the database. This causes anonymous requests to be served directly by Apache from the file system, and thus avoiding Drupal's bootstrap and database I/O altogether.
Use this module if you know how to set up Apache .htaccess files.
Use memcache if you know how to set it up
There is a memcache module which causes the Drupal cache to be in memory rather than in the database, thus avoiding database and disk I/O overhead.
Setting up this module for Drupal 5 is a bit involved, but if you are brave enough, it is worth the effort.
Read articles at 2bits.com
There is much more to be said about performance of a Drupal web site.
2bits posts regular performance tuning and optimization articles. So, go and read about the above points, and much more. You can also subscribe to our articles feed to get these documents as they get posted.
Hire 2bits to tune your Drupal web site
As part of our Drupal consulting services, 2bits offers a performance tuning and optimization service. Use the Contact link to ask for an engagement.
It's ironic I came across
Submitted by mimo (not verified) on Tue, 2008/09/16 - 07:01.
It's ironic I came across this article while looking for ideas on improving drupal performance as I work as a sysadmin for a small hosting provider.
I think your points are well made but they depend on a premise that hosting providers don't care about individual web sites and don't want to provide the best for their clients. Maybe that's the case. In our case it's not I'd say and hope. There are actually a couple of good things about using a shared hosting provider.
Is there a VPS provider (or
Submitted by Ken (not verified) on Fri, 2008/02/22 - 12:08.
Is there a VPS provider (or set of providers) you recommend? I have heard conflicting reports that with these you may run into similar issues you would run into with shared hosting.
I can't speak for Khalid...
Submitted by Rick Vugteveen (not verified) on Sat, 2008/11/29 - 19:51.
... but I'd take a look at either SliceHost or Linode. Both offer low cost "DIY" Xen based VPS hosting. I am a happy SliceHost customer. I came from a shared hosting background and found that their articles section was just what I needed to get going. Most of the tutorials on SliceHost and 2bits focus on Ubuntu, making them complementary. Best of luck Ken!
Slicehost is good
Submitted by Khalid on Sat, 2008/11/29 - 20:23.
Slicehost is definitely good.
The technology they use is good, and their support is good too.
They have been purchased by Rackspace though. Not sure if this will change things or not. Hopefully not.
How to measure the execution time of your Drupal scripts
While doing Drupal development, you sometimes want to find out how fast or slow a certain part of your code is. Did you know that Drupal has three easy functions for this built right into core: timer_start($name), timer_read($name) and timer_stop($name)?
You can have multiple timers in one run and each one of them is identified by the $name parameter you have to pass to each of these three functions.
I think the functions are pretty self explanatory:
* timer_start starts a timer
* timer_read tells you how long the timer has been running
* and timer_stop stops the timer and returns an array that contains the number of times the timer has been started and stopped (count) and the accumulated timer value in ms (time).
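A quick sketch of how the three calls fit together; the timer name 'my_timer' and the loop are invented:
<?php
timer_start('my_timer');                // start (or restart) the named timer

for ($i = 0; $i < 10000; $i++) {
    $x = sqrt($i);                      // stand-in for the code being measured
}

$elapsed = timer_read('my_timer');      // milliseconds so far; the timer keeps running
$result  = timer_stop('my_timer');      // stops it; array with 'count' and 'time'

drupal_set_message('So far: ' . $elapsed . ' ms, total: ' . $result['time'] . ' ms');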
How to truly disable drupal cache
Drupal, like most CMSes, uses caching intensively to speed things up. On a production site that is exactly what you want. When developing, however, the cache can cause you much pain.
If you're an expert in Drupal you have probably already worked out when you need to clear the cache in order to see changes and when you don't; the developer module even makes it quite easy to do. However for me, and I am quite sure for quite a lot of other people, it is not that clear.
"Wait", you might say. "Drupal has a disable cache setting". You would be right to say that. Only it doesn't completely disable the cache, just part of it, namely the page caching. To truly disable the cache you need to hack the cache code a bit. Yes, I know they say never to hack core (the cache is part of the core Drupal release); for me this was worth it. I figure that as long as you remember it's there and don't use it in production you should be fine. The following piece of code needs to be added in includes/cache.inc. There are two functions there that you need to edit:
* cache_get
* cache_set
In cache_get place the code right after the global $user. In cache_set place it at the start of the function. Here is the code that you need to add.
// Treat every cache lookup or write as a no-op while the
// site-wide 'cache' variable is set to CACHE_DISABLED.
if (variable_get('cache', CACHE_DISABLED) == CACHE_DISABLED) {
  return 0;
}
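To show where the guard lands, here is a sketch of the top of cache_get() after the edit; the signature follows the Drupal 5/6 era, so treat it as illustrative rather than an exact copy of core:
<?php
// includes/cache.inc (sketch)
function cache_get($cid, $table = 'cache') {
  global $user;

  // The hack: report a miss whenever caching is disabled site-wide.
  if (variable_get('cache', CACHE_DISABLED) == CACHE_DISABLED) {
    return 0;
  }

  // ... original cache_get() body continues unchanged ...
}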
Kudos for this hack goes to Rolf van der Krol. Cheers mate.
Tuesday, October 20, 2009
solution stack
In computing, a solution stack is a set of software subsystems or components needed to deliver a fully functional solution, e.g. a product or service.
For example, to develop a web application, the designer needs to use an operating system, web server, database and programming language. Another version of a solution stack is operating system, middleware, database, and applications.[1]
One of the many possible solution stacks available is LAMP:[2]
* Linux (the operating system)
* Apache (the web server)
* MySQL (the database management system)
* Perl, PHP, or Python (scripting languages)
Enable MySQL Compression
garym@teledyn.com - October 22, 2004 - 00:55
Project: Drupal
Version: 7.x-dev
Component: database system
Category: feature request
Priority: normal
Assigned: Unassigned
Status: postponed
Description
A tiny change to the mysql connect can result in twice the speed overall on connections; I went from an average of 2.7sec per cached homepage to just over 1.4sec by simply adding this switch.
Apparently MySQL has had this MYSQL_CLIENT_COMPRESS for a long time, but yesterday was the first I'd ever heard of it.
Attachment Size
mysql-compress.patch 1.19 KB
#1
killes@www.drop.org - October 22, 2004 - 09:37
It is not clear to me why this should help. Most of us do run mysql and apache on the same server. Can you explain that to me?
#2
garym@teledyn.com - October 24, 2004 - 01:23
I'm not completely hip to the inner workings of unix sockets, but I do know that a socket is a file handle and I know MySQL has a hard limit on the number of connections; any change that minimizes the length of a transaction means you tie up a precious file-handle for less time.
Is it fair to say "Most of us do run mysql and apache on the same server"? I run 7 Drupals myself, all of them on a webhost who locates MySQL on a dedicated machine, and I admin on 3 Drupals for clients, all of whom have MySQL on other machines, either for the same webhosts-rules reason, or in one case because they have large dreams.
Also, when a site becomes popular, moving MySQL to another machine is the easiest way for Drupal to be served over a multi-headed cluster (ie two drupal servers fed from a third MySQL server). The patch allows Drupal to scale.
The patch, so far as I know, is at worst benign on a single host, it should still show a performance boost at the cost of more CPU consumption due to less traffic down the interprocess pipes. It appears to boost the page speed on my dev server, but this machine is too under-powered and otherwise too loaded to really say by how much.
At the very least, as with the pconnect advice, perhaps this connect flag option should be mentioned in the comments.
#3
bertboerland@ww... - October 24, 2004 - 12:29
I'll have to say that anything that makes it easier and speedier to run Drupal on a 2 (or 3) tier framework makes it better. Indeed, for most users with relatively small sites it will be enough to have the webserver and database server on the same host (with lots of other processes as well). But for serious hosting, scaling out is the easiest way of taking load.
Scaling out being having more hosts instead of faster hosts. It's easier for operations; you can take one webserver offline while the farm still handles all requests with the loadbalancer in front of it.
If Drupal wants to grow, layout, content and logic must be more separable, also in hardware. So I am for optional compression from the web layer towards the database layer and hope it makes it into the code.
#4
Anonymous - October 25, 2004 - 15:29
It is seriously unclear to me why this would ever be a good idea except on very slow connections between web server and database. And if you have a very slow connection, you shouldn't be trying to run a cluster of HTTP servers for your site, as scaling the web end of it will be the least of your worries.
It should be clear to everyone that provided the connection is quick enough (and 100/1000Mbit ethernet should be plenty) compression will merely slow things down - it's just extra overhead for both the MySQL server and the PHP client - you still need to pass as much data around, you're just adding the overhead of compressing and decompressing it when you shouldn't need to because the link in the middle is plenty fast enough to cope with it being uncompressed. A cached page is unlikely to be more than about 100Kb of HTML. Pulling 100Kb over a 100Mbit link should take you about 1/100th of a second. This is seriously unlikely to be your bottleneck. :-)
If you're running the MySQL and HTTP processes on the same box then you will be compressing/decompressing stuff over what is effectively an infinite bandwidth pipe, which is extremely stupid because it will always slow things down. If you benchmark this and genuinely find that there is a speed increase in this case I would be astonished.
If you have quick servers that are idle enough that they can compress things faster than they can send them over ethernet and the bottleneck really is a 100MBIT connection in the middle, then we have a serious issue with the amount of stuff Drupal is pulling off the database and should look into that. Additionally, your servers evidently won't be under enough load for page generation to take more than a few ms anyway.
#5
robertDouglass - October 25, 2004 - 15:37
There is not really much to discuss here, only things to benchmark. It is clear that this needs to be benchmarked for at least two cases: MySQL on the same box and MySQL on a different box. I listen to benchmarks much more closely than I listen to arguments why something may or may not be better than something else. The only question is, who is going to do the benchmarks and how? I am not volunteering as I don't have the right setup to test both cases objectively.
#6
Anonymous - October 25, 2004 - 16:03
I think this discussion is fairly irrelevant, as well as would be most benchmarks.
Let me be a little more specific here.
A) Any person with enough brain cells left alive should be able to crack open the code, and add the 'MYSQL_CLIENT_COMPRESS' parameter to the end of the mysql_connect function; if you’re already optimizing, you will know about the existence of the feature since you’ve read the documentation.
B) Compression is a tradeoff between transfer time (size of data being transferred) and CPU usage (the cycles spent on compressing and decompressing the data), the compression is not free; even if it looks like a good idea when loading *one* page (due to less data exchanged, and much CPU available), it will not be a good idea when the CPU is loaded like hell on a busy server.
C) as far as the unix sockets and blocking argument in the thread, I’d say it's both mostly false, and additionally should be resolved by using persistent connections (mysql_pconnect), rather than trying to 'shorten the time we spend connected to the socket'.
-rL
#8
garym@teledyn.com - October 26, 2004 - 04:08
Hmmm ... I certainly have 100Mbit between these Solaris servers, and a physical distance of only a few meters, and while I'm not privy to the mysql server, the webserver generally runs pretty warm, with unix load generally at least 3 and often between 10 and 20, and yet, using the Apache Benchmark, on the live loaded server (which is also seeing other traffic) I clearly see 100% increase in speed (ie, pages generated in half the time or better) with just this one change.
Curiously also, prior to adding this flag, users complained of "click doing nothing" by which I assume they mean they click on a link and the browser's busy-icon whirrs a bit, and then stops because the connection timed out or was aborted somewhere along the path. With just this one switch change the frequency of these no-action clicks dropped from about three out of four to less than one in twenty. Monitoring the load (using 'top') I also would normally see the baseline server load of 1+ climb to 5+ when I enable my drupal path, but with this flag the load only increased to 3+ ... this last result could simply be coincidence as my site was often assailed by periodic swarms of both spammers and RSS-hounds.
Given what's posted above, I've insufficient brain cells to explain why all this performance improvement should be so.
But again, I'm no expert and these are not controlled laboratory benchmark tests, just an observed change in behaviour on a production server under live-load real-world conditions, and I can only really say that it worked for me. pconnect, on the other hand and while counter-intuitive to its definition, made matters dramatically worse, leaving open file handles scattered in all directions, eventually causing the servers to completely seize up and reboot. My webhost still hates me for that little experiment ...
#9
Anonymous - October 26, 2004 - 14:30
I suspect these results are specific to your scenario; the supposed performance improvement is probably a result of an 'unexpected' interaction in your specific setup. It could be an indication of network problems (and thus, for example, the compression results in less data transferred, thus fewer retransmissions, thus seemingly better performance, etc.), or of some artifacts created by your operating system (leaky management of sockets, file handles, memory, or that army of small gray elves who move the bits and bytes across the busses) which causes the MYSQL_CLIENT_COMPRESS to create a side-effect of it disappearing.
With all that said, if it works for you - and if you lack the desire/time/resources to reproduce this in a clean setup with no unknowns to get to the bottom of this - then well, you found a clever hack which works in your situation! Keep that engineering spirit alive!
This however is not a reason to implement this as part of the default Drupal setup. The outcome, although positive for your scenario, still defies "the way things work" (TM) in more ways than one.
Imho, this should be left at that, and possibly serve people in the future stumbling across this thread in search of 'things I can try to make things better'.
- rL
#10
Steven - December 24, 2004 - 06:37
This isn't going to be patched in, so marking as won'tfix.
#11
kbahey - December 22, 2007 - 05:21
Version: x.y.z » 7.x-dev
Status: won't fix » needs review
I ran into a situation where a client who runs MySQL and PHP on different servers was experiencing long-running queries. Namely, some cache queries took hundreds of milliseconds. The link between the servers was misconfigured (100Mbps instead of 1000Mbps), and this did show up when a lot of data was sent from the server to the client.
Perhaps this would not go into core as is, but:
1. We can have a settings.php flag to say compress/don't compress, and those who need it can turn it on.
2. Others can find it useful as is and apply the patch to their setup.
Here is the patch. It is just a one liner and a few comments.
Index: includes/database.mysql.inc
===================================================================
RCS file: /cvs/drupal/drupal/includes/database.mysql.inc,v
retrieving revision 1.85
diff -u -r1.85 database.mysql.inc
--- includes/database.mysql.inc 19 Dec 2007 13:03:16 -0000 1.85
+++ includes/database.mysql.inc 22 Dec 2007 05:13:05 -0000
@@ -77,9 +77,12 @@
// mysql_connect() was called before with the same parameters.
// This is important if you are using two databases on the same
// server.
- // - 2 means CLIENT_FOUND_ROWS: return the number of found
+ // - MYSQL_CLIENT_FOUND_ROWS: return the number of found
// (matched) rows, not the number of affected rows.
- $connection = @mysql_connect($url['host'], $url['user'], $url['pass'], TRUE, 2);
+ // - MYSQL_CLIENT_COMPRESS: compress the data sent from MySQL server
+ // to the client. This can speed things when MySQL and PHP are on
+ // different servers and have a relatively slow link.
+ $connection = @mysql_connect($url['host'], $url['user'], $url['pass'], TRUE, MYSQL_CLIENT_FOUND_ROWS|MYSQL_CLIENT_COMPRESS);
if (!$connection || !mysql_select_db(substr($url['path'], 1))) {
// Show error screen otherwise
_db_error_page(mysql_error());
#12
KarenS - December 22, 2007 - 17:50
subscribing.
I am on a shared host that has the mysql server on a different machine than the web server and I have no information or control over how that is configured, probably not an uncommon scenario. And I also am plagued with sporadic timeout problems, mostly in some administrative pages that load lots of stuff and in places where big caches are being retrieved or saved (like the Views and CCK caches).
I had no idea this might be my problem or that this was something I could control, so at least documenting it would be beneficial if it sometimes helps. When I get a chance (won't be right now with the holidays) I'll try this out on my setup.
#13
birdmanx35 - February 19, 2008 - 18:25
If this has performance benefits, it's worth talking about.
#14
fysa - May 13, 2008 - 22:59
subscribed, awaiting adventurous benchmarkee
#15
fysa - May 14, 2008 - 02:00
FYI: 11.11 requests per second -> 7.77 requests per second with both apache2 and mysql on the same 5.7 box. ;)
#16
Crell - August 21, 2008 - 22:40
Status: needs review » needs work
The patch is rather guaranteed to not apply now. :-)
However, with the new array-based DB configuration it should be possible to add support for a "compressed" => TRUE flag that only MySQL pays attention to. If it's set, run the code to enable compression. If not, don't.
If there is sometimes a benefit and sometimes not, a documented toggle sounds like the best way to go.
#17
chx - August 22, 2008 - 08:56
http://bugs.php.net/bug.php?id=44135&edit=1 until this is fixed I doubt this one can be done.
#18
ajayg - January 4, 2009 - 05:40
For whatever it is worth. This small change significantly helped in performance as documented here http://groups.drupal.org/node/17913. USing drupal 5.12
#19
Dave Reid - January 4, 2009 - 07:26
Status: needs work » postponed
I like the idea, but I've confirmed there's currently no way to enable MySQL compression with PDO, so marking as postponed.
#20
kbahey - January 4, 2009 - 15:34
Adding link where it says it is not supported.
Bummer ...
#21
ajayg - January 4, 2009 - 18:07
Hmm. It worked fine for mysqli and mysql. So what is PDO? (sorry for a newbie question but I have always used either mysql or mysqli with drupal).
#22
kbahey - January 4, 2009 - 18:33
PDO is the new database layer for Drupal 7.x.
#23
mrfelton - September 19, 2009 - 14:08
I too would like to see this benchmarked and tested.
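For anyone who wants to try the toggle on a pre-PDO setup (the mysql driver used by Drupal 5/6), the whole change boils down to one extra flag on the connect call, as in kbahey's patch above:
<?php
// Sketch of the patched connect in includes/database.mysql.inc.
// MYSQL_CLIENT_COMPRESS makes the client and server compress data on the wire,
// which can help when the web and database servers share a slow link.
$connection = @mysql_connect(
    $url['host'], $url['user'], $url['pass'],
    TRUE,
    MYSQL_CLIENT_FOUND_ROWS | MYSQL_CLIENT_COMPRESS
);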
garym@teledyn.com - October 22, 2004 - 00:55
Project: Drupal
Version: 7.x-dev
Component: database system
Category: feature request
Priority: normal
Assigned: Unassigned
Status: postponed
Jump to:
* Most recent comment
Description
A tiny change to the mysql connect can result in twice the speed overall on connections; I went from an average of 2.7sec per cached homepage to just over 1.4sec by simply adding this switch.
apparently MySQL has had this MYSQL_CLIENT_COMPRESS for a long time, but yesterday was the first I'd ever heard of it.
Attachment Size
Attachment Size
mysql-compress.patch 1.19 KB
» Login or register to post comments
#1
killes@www.drop.org - October 22, 2004 - 09:37
It is not clear to me why this should help. Most of us do run mysql and apache on the same server. Can you explain that to me?
#2
garym@teledyn.com - October 24, 2004 - 01:23
I'm not completely hip to the inner workings of unix sockets, but I do know that a socket is a file handle and I know MySQL has a hard limit on the number of connections; any change that minimizes the length of a transaction means you tie up a precious file-handle for less time.
Is it fair to say "Most of us do run mysql and apache on the same server"? I run 7 Drupals myself, all of them on a webhost who locates MySQL on a dedicated machine, and I admin on 3 Drupals for clients, all of whom have MySQL on other machines, either for the same webhosts-rules reason, or in one case because they have large dreams.
Also, when a site becomes popular, moving MySQL to another machine is the easiest way for Drupal to be served over a multi-headed cluster (ie two drupal servers fed from a third MySQL server). The patch allows Drupal to scale.
The patch, so far as I know, is at worst benign on a single host; it should still show a performance boost, at the cost of more CPU consumption, due to less traffic down the interprocess pipes. It appears to boost the page speed on my dev server, but this machine is too under-powered and otherwise too loaded to really say by how much.
At the very least, as with the pconnect advice, perhaps this connect flag option should be mentioned in the comments.
#3
bertboerland@ww... - October 24, 2004 - 12:29
I'll have to say that anything that makes it easier and speedier to run Drupal on a 2 (or 3) tier framework will make it better. For most users with relatively small sites it will indeed be enough to have the webserver and database server on the same host (with lots of other processes as well). But for serious hosting, scaling out is the easiest way of taking load.
Scaling out means having more hosts instead of faster hosts. It's easier for operations: you can take one webserver offline while the farm still handles all requests, with the loadbalancer in front of it.
If Drupal wants to grow, layout, content and logic must be more separable, also in hardware. So I am for optional compression from the web layer towards the database layer, and I hope it makes it into the code.
#4
Anonymous - October 25, 2004 - 15:29
It is seriously unclear to me why this would ever be a good idea except on very slow connections between web server and database. And if you have a very slow connection, you shouldn't be trying to run a cluster of HTTP servers for your site, as scaling the web end of it will be the least of your worries.
It should be clear to everyone that provided the connection is quick enough (and 100/1000Mbit ethernet should be plenty) compression will merely slow things down - it's just extra overhead for both the MySQL server and the PHP client. You still need to pass as much data around; you're just adding the overhead of compressing and decompressing it when you shouldn't need to, because the link in the middle is plenty fast enough to cope with it being uncompressed. A cached page is unlikely to be more than about 100Kb of HTML. Pulling 100Kb over a 100Mbit link should take you about 1/100th of a second. This is seriously unlikely to be your bottleneck. :-)
If you're running the MySQL and HTTP processes on the same box then you will be compressing/decompressing stuff over what is effectively an infinite bandwidth pipe, which is extremely stupid because it will always slow things down. If you benchmark this and genuinely find that there is a speed increase in this case I would be astonished.
If you have quick servers that are idle enough that they can compress things faster than they can send them over ethernet and the bottleneck really is a 100MBIT connection in the middle, then we have a serious issue with the amount of stuff Drupal is pulling off the database and should look into that. Additionally, your servers evidently won't be under enough load for page generation to take more than a few ms anyway.
#5
robertDouglass - October 25, 2004 - 15:37
There is not really much to discuss here, only things to benchmark. It is clear that this needs to be benchmarked for at least two cases: MySQL on the same box and MySQL on a different box. I listen to benchmarks much more closely than I listen to arguments why something may or may not be better than something else. The only question is, who is going to do the benchmarks and how? I am not volunteering as I don't have the right setup to test both cases objectively.
#6
Anonymous - October 25, 2004 - 16:03
I think this discussion is fairly irrelevant, as well as would be most benchmarks.
Let me be a little more specific here.
A) Any person with enough brain cells left alive should be able to crack open the code, and add the 'MYSQL_CLIENT_COMPRESS' parameter to the end of the mysql_connect function; if you’re already optimizing, you will know about the existence of the feature since you’ve read the documentation.
B) Compression is a tradeoff between transfer time (size of data being transferred) and CPU usage (the cycles spent on compressing and decompressing the data), the compression is not free; even if it looks like a good idea when loading *one* page (due to less data exchanged, and much CPU available), it will not be a good idea when the CPU is loaded like hell on a busy server.
C) as far as the unix sockets and blocking argument in the thread, I’d say it's both mostly false, and additionally should be resolved by using persistent connections (mysql_pconnect), rather than trying to 'shorten the time we spend connected to the socket'.
-rL
#8
garym@teledyn.com - October 26, 2004 - 04:08
Hmmm ... I certainly have 100Mbit between these Solaris servers, and a physical distance of only a few meters, and while I'm not privy to the mysql server, the webserver generally runs pretty warm, with unix load generally at least 3 and often between 10 and 20, and yet, using the Apache Benchmark, on the live loaded server (which is also seeing other traffic) I clearly see a 100% increase in speed (ie, pages generated in half the time or better) with just this one change.
Curiously also, prior to adding this flag, users complained of "click doing nothing" by which I assume they mean they click on a link and the browser's busy-icon whirrs a bit, and then stops because the connection timed out or was aborted somewhere along the path. With just this one switch change the frequency of these no-action clicks dropped from about three out of four to less than one in twenty. Monitoring the load (using 'top') I also would normally see the baseline server load of 1+ climb to 5+ when I enable my drupal path, but with this flag the load only increased to 3+ ... this last result could simply be coincidence as my site was often assailed by periodic swarms of both spammers and RSS-hounds.
Given what's posted above, I've insufficient brain cells to explain why all this performance improvement should be so.
But again, I'm no expert and these are not controlled laboratory benchmark tests, just an observed change in behaviour on a production server under live-load real-world conditions, and I can only really say that it worked for me. pconnect, on the other hand, and while counter-intuitive to its definition, made matters dramatically worse, leaving open file handles scattered in all directions, eventually causing the servers to completely seize up and reboot. My webhost still hates me for that little experiment ...
#9
Anonymous - October 26, 2004 - 14:30
I suspect these results are specific to your scenario; the supposed performance improvement is probably a result of an 'unexpected' interaction in your specific setup. It could be an indication of network problems (and thus, for example, the compression results in less data transferred, thus fewer retransmissions, thus seemingly better performance, etc.), or of some artifacts created by your operating system (leaky management of sockets, file handles, memory, or that army of small gray elves who move the bits and bytes across the busses) which causes MYSQL_CLIENT_COMPRESS to create the side-effect of it disappearing.
With all that said, if it works for you - and if you lack the desire/time/resources to reproduce this in a clean setup with no unknowns to get to the bottom of this - then well, you found a clever hack which works in your situation! Keep that engineering spirit alive!
This however is not a reason to implement this as part of the default Drupal setup. The outcome, although positive for your scenario, still defies "the way things work" (TM) in more ways than one.
Imho, this should be left at that, and can possibly serve people in the future stumbling across this thread in search of 'things I can try to make things better'.
- rL
#10
Steven - December 24, 2004 - 06:37
This isn't going to be patched in, so marking as won'tfix.
#11
kbahey - December 22, 2007 - 05:21
Version: x.y.z » 7.x-dev
Status: won't fix » needs review
I ran into a situation where a client who runs MySQL and PHP on different servers was experiencing long-running queries. Namely, some cache queries took hundreds of milliseconds. The link between the servers was misconfigured (100 Mbps instead of 1000 Mbps), and this did show up when a lot of data was sent from the server to the client.
Perhaps this would not go into core as is, but:
1. We can have a settings.php flag to say compress/don't compress, and those who need it can turn it on.
2. Others can find it useful as is and apply the patch to their setup.
Here is the patch. It is just a one liner and a few comments.
Index: includes/database.mysql.inc
===================================================================
RCS file: /cvs/drupal/drupal/includes/database.mysql.inc,v
retrieving revision 1.85
diff -u -r1.85 database.mysql.inc
--- includes/database.mysql.inc 19 Dec 2007 13:03:16 -0000 1.85
+++ includes/database.mysql.inc 22 Dec 2007 05:13:05 -0000
@@ -77,9 +77,12 @@
// mysql_connect() was called before with the same parameters.
// This is important if you are using two databases on the same
// server.
- // - 2 means CLIENT_FOUND_ROWS: return the number of found
+ // - MYSQL_CLIENT_FOUND_ROWS: return the number of found
// (matched) rows, not the number of affected rows.
- $connection = @mysql_connect($url['host'], $url['user'], $url['pass'], TRUE, 2);
+ // - MYSQL_CLIENT_COMPRESS: compress the data sent from MySQL server
+ // to the client. This can speed things when MySQL and PHP are on
+ // different servers and have a relatively slow link.
+ $connection = @mysql_connect($url['host'], $url['user'], $url['pass'], TRUE, MYSQL_CLIENT_FOUND_ROWS|MYSQL_CLIENT_COMPRESS);
if (!$connection || !mysql_select_db(substr($url['path'], 1))) {
// Show error screen otherwise
_db_error_page(mysql_error());
#12
KarenS - December 22, 2007 - 17:50
subscribing.
I am on a shared host that has the mysql server on a different machine than the web server and I have no information or control over how that is configured, probably not an uncommon scenario. And I also am plagued with sporadic timeout problems, mostly in some administrative pages that load lots of stuff and in places where big caches are being retrieved or saved (like the Views and CCK caches).
I had no idea this might be my problem or that this was something I could control, so at least documenting it would be beneficial if it sometimes helps. When I get a chance (won't be right now with the holidays) I'll try this out on my setup.
#13
birdmanx35 - February 19, 2008 - 18:25
If this has performance benefits, it's worth talking about.
#14
fysa - May 13, 2008 - 22:59
subscribed, awaiting adventurous benchmarkee
#15
fysa - May 14, 2008 - 02:00
FYI: 11.11 requests per second -> 7.77 requests per second with both apache2 and mysql on the same 5.7 box. ;)
#16
Crell - August 21, 2008 - 22:40
Status: needs review » needs work
The patch is rather guaranteed to not apply now. :-)
However, with the new array-based DB configuration it should be possible to add support for a "compressed" => TRUE flag that only MySQL pays attention to. If it's set, run the code to enable compression. If not, don't.
If there is sometimes a benefit and sometimes not, a documented toggle sounds like the best way to go.
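A hypothetical sketch of what that toggle could look like in a Drupal 7 settings.php (the 'compressed' key does not actually exist; it is only the flag being proposed here, and the names and credentials are placeholders):
$databases['default']['default'] = array(
  'driver'   => 'mysql',
  'database' => 'drupal',
  'username' => 'user',
  'password' => 'secret',
  'host'     => 'db.example.com',
  // Hypothetical flag: only the MySQL driver would act on it.
  'compressed' => TRUE,
);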
#17
chx - August 22, 2008 - 08:56
http://bugs.php.net/bug.php?id=44135&edit=1 until this is fixed I doubt this one can be done.
#18
ajayg - January 4, 2009 - 05:40
For whatever it is worth, this small change significantly helped performance, as documented here: http://groups.drupal.org/node/17913. Using Drupal 5.12.
#19
Dave Reid - January 4, 2009 - 07:26
Status: needs work » postponed
I like the idea, but I've confirmed there's currently no way to enable MySQL compression with PDO, so marking as postponed.
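Note that newer PHP releases do expose compression through the PDO MySQL driver, via the PDO::MYSQL_ATTR_COMPRESS constant passed in the driver-options array at connect time; a minimal sketch, with a placeholder DSN and credentials:
$pdo = new PDO(
  'mysql:host=db.example.com;dbname=drupal',
  'user', 'secret',
  array(PDO::MYSQL_ATTR_COMPRESS => TRUE) // request protocol compression
);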
#20
kbahey - January 4, 2009 - 15:34
Adding link where it says it is not supported.
Bummer ...
#21
ajayg - January 4, 2009 - 18:07
Hmm. It worked fine for mysqli and mysql. So what is PDO? (sorry for a newbie question but I have always used either mysql or mysqli with drupal).
#22
kbahey - January 4, 2009 - 18:33
PDO is the new database layer for Drupal 7.x.
#23
mrfelton - September 19, 2009 - 14:08
I too would like to see this benchmarked and tested.
Large result sets vs. compression protocol
Posted by shodan
The mysql_connect() function in PHP's MySQL interface (which for reference maps to the mysql_real_connect() function in the MySQL C API) has had a $client_flags parameter since PHP 4.3.0. This parameter is barely known and almost always overlooked, but in some cases it could provide a nice boost to your application.
There are a number of different flags that can be used. We're interested in a specific one, MYSQL_CLIENT_COMPRESS. This flag tells the client application to enable compression in the network protocol when talking to mysqld. It reduces network traffic, but at the cost of some CPU time: the server has to compress the data and the client has to decompress it. So there's little sense in using it if your Web application is on the same host as the database.
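A minimal sketch of turning the flag on (host and credentials are placeholders):
<?php
// Open a connection with MySQL protocol compression enabled.
$link = mysql_connect('db.example.com', 'user', 'secret', FALSE, MYSQL_CLIENT_COMPRESS);
if (!$link) {
  die('Connection failed: ' . mysql_error());
}
mysql_select_db('mydb', $link);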
When the database is on a dedicated server, compression essentially means trading CPU time (on both server and client) for network time. Obviously, if the network is fast enough, the benefit in network time will not outweigh the loss in CPU time. The question is, where exactly does the border lie?
It turns out that a 100 Mbit link (with 1.4 ms round-trip time) is not fast enough. Oleksandr Typlynski, one of the Sphinx users, conducted a benchmark, indexing 600 MB of data over a 100 Mbit link. The data was textual and compressed well, reducing traffic more than 3 times. With compression, total indexing time dropped from 127 sec to 87 sec. That's almost a 1.5x improvement in total run time, and the MySQL query time improvement is even greater. On the other hand, a 1 Gbit link was fast enough: total run time was 1.2x worse with compression.
The bottom line: if you’re fetching big result sets to the client, and client and MySQL are on different boxes, and the connection is 100 Mbit, consider using compression. It’s a matter of adding one extra magic constant to your application, but the benefit might be pretty big.
Slow MySQL queries on a multi-server setup: use compression
Published Sat, 2007/12/22 - 01:48
A few months ago, we saw something strange at a client. They were facing slow cache queries, such as the following ones.
2507.25 1 cache_get SELECT data, created, headers, expire FROM cache_menu WHERE cid = '1:en'
1303.68 1 cache_get SELECT data, created, headers, expire FROM cache WHERE cid = 'content_type_info'
They are running MySQL on a separate box from the one that has PHP running. Running the same SQL locally on the MySQL box did not show the same slow performance as running it from the PHP box.
Upon closer investigation, it was found that they had the link between the boxes set to 10 Mbps only, instead of the full 1000 Mbps that it can do. Once the servers were configured for the proper speed, performance became much better.
What is interesting is that there is a MySQL option to get around such a problem on the software level. MySQL provides a flag for the mysql_connect() function that would compress the data sent. See MySQL Performance Blog: Large result sets vs. compression protocol.
This has also been reported more than 3 years ago for Drupal in issue #11891. So, I created a patch for Drupal 7 (applies to Drupal 6 RC1 as well) that you can download and apply. Ideally, this would be a settings.php option that can be turned on for large sites that span more than one box.
Trivial uses of Telnet - HTTP
HTTP - Browsing page source
Webpages are served up using a protocol called Hypertext Transfer Protocol, or HTTP for short. Standards suggest that port 80 should be where the HTTP service listens, although it is trivial for the administrator to use another port. For the purposes of our little exercise we are going to look at very simple ways to get the webserver to serve up the front page of its site; there are a myriad of things you can do, but we are going to keep it pretty trivial.
Start off by connecting your chosen tool to port 80 of a website (www.example.com for the purposes of this demonstration). To start off with, we type the following and press return twice after we are done. If you are using telnet you need to get it right the first time, otherwise it will not work correctly.
GET / HTTP/1.0
What we have just asked is for the system to send us the document (GET) which exists as the front page of the website (/ or root document), and we specify that we just want a simple request without all the extra fuss, which I will explain later (HTTP/1.0).
In response the server replied with the following (this is just an example so yours will be different, plus I have tweaked this to make it simpler so some of the numbers will not be accurate):
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Cache-Control: no-cache
Expires: Mon, 24 Dec 2001 00:49:17 GMT
Content-Location: http://10.0.0.100/index.html
Date: Mon, 24 Dec 2001 00:49:17 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Mon, 24 Dec 2001 00:27:03 GMT
ETag: "60fa8cb5118cc11:adc"
Content-Length: 103
<HTML>
<HEAD>
<TITLE>Demo page</TITLE>
</HEAD>
<BODY>
This is just a test.
</BODY>
</HTML>
When the webserver is done telling us about this page it closes the connection since we have the data, and it has other requests to process - when this happens we get this message:
Connection to host lost.
So what have we learnt from this? Well, everything before the blank line is called the header, and everything below it is called the content; normally you only see content, since your browser filters out headers for you. Headers, however, contain rather a lot of information...
* The site appears to have a standard front page (denoted by the 200 status). A status of 3XX implies you have to go to a second page to find what you are looking for; a status of 4XX implies that there was a problem getting this page (either it was missing or you aren't allowed access to it at this time, etc.); lastly, a status of 5XX would mean that the server had problems processing the request.
* The site appears to be running Microsoft's IIS version 5 (this is only found on Windows 2000), so I know what webserver and operating system they appear to be running. If you haven't guessed, you get this from the string that says Microsoft-IIS/5.0.
* We have a Content-Location header which can often give away information such as the internal addresses of machines, full paths to documents on the website and a multitude of other things.
* We have a Last-Modified header which does what you expect it to - details the last date and time that this page was modified, sometimes very useful to see how frequently a website is really updated.
The actual content side of it will generally only give away the errors of the designers, but looking out for and identifying those is an entire lesson in itself.
HTTP/1.1 vs HTTP/1.0
HTTP/1.1 is a widely used extension to HTTP/1.0, as it allows the client more control over the content it is being delivered. Like most protocols it is over-engineered, so much so that there are features built into it that are rarely ever used - they were nice ideas, but very few people implement features such as only giving out certain types of content if the client can accept them.
The minimum number of details that make up a request you can expect to use and get a valid response back is the following:
telnet www.example.com 80
Connected to www.example.com.
Escape character is '^]'.
GET / HTTP/1.1
Host: www.example.com
Connection: Close
However in practice you are more likely to be using a replica set of request data since it fools the website into thinking a browser is visiting their site, and also makes sure that if for some reason the website needs the regular amount of data it has it.
telnet www.example.com 80
Connected to www.example.com.
Escape character is '^]'.
GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Host: www.example.com
Connection: Close
Example of Show Header Only:
telnet www.example.com 80
Connected to www.example.com.
Escape character is '^]'.
HEAD / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Host: www.example.com
Connection: Close
Then you should see the response message:
HTTP/1.1 200 OK
Date: Tue, 20 Oct 2009 16:53:41 GMT
Server: Apache/2.2.11 (FreeBSD) DAV/2 PHP/5.2.10 with Suhosin-Patch
X-Powered-By: PHP/5.2.10
Set-Cookie: SESS93e15ce578546d0a845aa3efdd4d6bde=00d162992b81068ab56f0ce61d164494; expires=Thu, 12-Nov-2009 20:27:01 GMT; path=/; domain=.www.example.com
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Tue, 20 Oct 2009 16:53:41 GMT
Cache-Control: store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Connection: close
Content-Type: text/html; charset=utf-8
The main difference is the Host: request header, as this allows website hosting companies to put more than one website on an address and have them all accessible (referred to as a virtual server); if you try to access a virtual server without the Host: line you will not get the site you expect!
Just in case you were curious, the other request headers used in that example are:
- * Accept - gives a list of the types of data that you are in theory willing to accept.
- * Accept-Language - gives a list of the languages that you are in theory willing to accept.
- * Accept-Encoding - gives a list of the encoding methods that you are in theory willing to accept.
- * User-Agent - a string that describes the type of browser you are using.
- * Host - the hostname this request is destined for.
- * Connection - specifies how to handle this request.
The choice of which version you want to use comes down to how much effort you want to put into the task: whereas 1.0 is simple to the point that it is noticeably unrealistic, 1.1 is complex but more believable, since it is what a modern browser would use, so it will not look out of place. Also, it is useful to remember that you cannot access virtual servers using 1.0, since the Host header came in with the 1.1 specification.
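As a small companion to the telnet sessions above, the same header-only request can be scripted; a minimal PHP sketch (www.example.com is a placeholder):
<?php
// Issue a raw HTTP/1.1 HEAD request over a plain TCP socket.
$fp = fsockopen('www.example.com', 80, $errno, $errstr, 10);
if (!$fp) {
  die("$errstr ($errno)\n");
}
$request  = "HEAD / HTTP/1.1\r\n";
$request .= "Host: www.example.com\r\n";
$request .= "Connection: Close\r\n\r\n";
fwrite($fp, $request);
while (!feof($fp)) {
  echo fgets($fp, 512); // the response headers, line by line
}
fclose($fp);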
Monday, October 19, 2009
Tuning the Apache MaxClients parameter
Published Sun, 2007/03/04 - 18:32, Updated Sat, 2007/10/27 - 13:40
One thing that can have a really drastic effect on a large site using Apache is the value assigned to the MaxClients parameter.
This parameter defines how many simultaneous requests can be served. Any connection requests from browsers that come in after that limit is reached will be queued.
Apache prefork, StartServers, MaxSpareServers and MinSpareServers
In the most common case, you will be using Apache in prefork mode, meaning one process per connection, with a pool of processes pre-forked to stand by for connections. The number of spare processes is defined by the values MaxSpareServers and MinSpareServers, while the number to start is defined by StartServers.
Maxclients default
By default, the MaxClients parameter has a compiled-in hard limit of 256. This can be changed by recompiling Apache, however. Some distributions or hosting companies raise this limit to a very high value, such as 512 or even 1024, in order to cope with large loads.
While this makes sense when the web server is serving static content (plain HTML, images, etc.), it can be detrimental to a dynamic web application like Drupal. So often, we have clients calling because their web server has ground to a halt, and the reason turns out to be an excessively high MaxClients value.
A web site's nemesis: Excessive Thrashing
The reason is that if your web site experiences a traffic spike, or if there is a bottleneck in the database, incoming requests cause new processes to be forked at a rate higher than the rate at which old processes can service the older connections. This creates a condition where the system keeps spawning new processes until they overflow the available memory and start to use the swap space. This almost always causes thrashing, where the system just swaps pages between physical memory and virtual memory (on disk) without doing any real work. You can detect whether thrashing has occurred by using the vmstat command (see our page on tools for performance tuning and optimization for more info).
A simple calculation for MaxClients on a system that does only Drupal would be:
(Total Memory - Operating System Memory - MySQL memory) / Size Per Apache process.
If your hosting company configured your server with all sorts of bells and whistles (like mod_perl and mod_python, in addition to mod_php), then Apache can easily be 21 MB per process. If your server has 512MB, then you can fit some 20 Apache processes. If you tune Apache well, remove all the unneeded modules, and install a PHP op-code cache/accelerator, then you can make each Apache process take as little as 12 MB. These figures depend on how many modules you have loaded and how big they are, so there is no hard and fast rule. Even if one has 1GB of memory and leaves 250 MB for the system and MySQL, with an Apache process of 15MB this means 50 Apache processes can fit in the remaining 750MB.
Remember that you need memory for the operating system as well as for MySQL. The more memory you give the system and MySQL, the more file system caching they do for you, avoiding hitting the disk, so do not use the very last of the available memory for MaxClients.
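As a back-of-the-envelope sketch of that formula (all figures are illustrative):
<?php
// Estimate MaxClients; all sizes in MB and purely illustrative.
$total_mb      = 2048; // total RAM on the box
$os_mb         = 256;  // reserved for the operating system and file cache
$mysql_mb      = 512;  // reserved for MySQL
$per_apache_mb = 15;   // observed resident size of one Apache process
echo floor(($total_mb - $os_mb - $mysql_mb) / $per_apache_mb); // prints 85
Starting below the computed number and raising it while monitoring is safer than starting above it.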
Tuning the ServerLimit
On some systems, there is another parameter, ServerLimit, that sets an upper limit on MaxClients. So for example, if ServerLimit is set by default to 256, and you want to increase MaxClients to 300, you will not be able to do so until you set ServerLimit to 300 as well. Normally, you will see a warning message from Apache when you restart it telling you that this needs to be done.
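For example, raising both directives together in a prefork configuration might look like this (values are illustrative only; ServerLimit must be at least as large as MaxClients):
<IfModule prefork.c>
    StartServers       8
    MinSpareServers    5
    MaxSpareServers   20
    ServerLimit      300
    MaxClients       300
</IfModule>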
Conclusion
If you cannot do a proper calculation, then it is safest to start with a conservative number, e.g. 60 to 150 on a 2GB system, and then increase it as you monitor the usage of the system over a few weeks. By all means, do not keep it at the 512 value that came with your server/distribution until you know how much load you can handle.
Resources and Links
Apache Performance Tuning. Has a section on MaxClients. Although this document is built with mod_perl in mind, much of it applies to using Apache with PHP as a module.
Performance tuning Apache. Another useful Apache performance tuning document.
How to change the upper limit of MaxClients by recompiling Apache. Not recommended for a dynamic web site.
ProjectOpus blog post by James on Apache2 Maxclients.
Apache performance tuning at devside.net.
Apache process memory size
Submitted by Visitor (not verified) on Fri, 2009/09/18 - 02:20.
No matter how we try, the process size is always around 60M:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20808 apache 15 0 362m 58m 21m R 27.4 0.7 0:03.20 httpd
20823 apache 16 0 402m 86m 23m S 10.7 1.1 0:01.67 httpd
20811 apache 16 0 348m 43m 21m S 3.8 0.5 0:02.48 httpd
20826 apache 16 0 362m 52m 21m S 3.4 0.7 0:02.34 httpd
I guess this is also influenced by the PHP memory limit we have? (96M)
You probably have too many
Submitted by Khalid on Fri, 2009/09/18 - 09:29.
You probably have too many modules.
Please see: server indigestion: the Drupal contributed modules Open Buffet Binge syndrome.
You can reduce the total memory required on the machine by running PHP as FastCGI, see Apache fcgid: acceptable performance and better resource utilization.
Thanks man you really solved
Submitted by ravi (not verified) on Thu, 2009/06/18 - 10:20.
Thanks man, you really solved my problem. I always got MaxClients and ServerLimit wrong. Thanks for sharing.
how big are my apache processes?
Submitted by wrb123 (not verified) on Tue, 2009/02/03 - 22:46.
When making this calculation, how do I accurately find the size in MB of each Apache process? For example, if I run top -d 1 and look at the results, I have different sizes for each process: VIRT, RES, and SHR. It looks like in top my total memory usage is in line with SHR, the smallest of the memory numbers. Is it safe to make this calculation based on the Apache process SHR size?
what are the required modules for apache?
Submitted by Visitor (not verified) on Tue, 2008/10/21 - 19:28.
Quote:
If your hosting company configured your server with all sorts of bells and whistles (like mod_perl, mod_python, in addition to mod_php), then Apache can easily be 21 MB per process.
== end of quote==
How do I know what Apache modules are required to run Drupal websites?
We have a dedicated server and root access, and we would like to have the Apache modules configured specifically for Drupal.
Is there a "list" of necessary Apache modules that we should have, while disabling all others?
Thanks for such a lovely article.
Apache modules
Submitted by Khalid on Tue, 2008/10/21 - 19:38.
The only required module is php (mod_php).
In addition to that, having the rewrite module will help your site by using clean URLs.
Also, the deflate module can save you some bandwidth.
--
2bits -- Drupal consulting