- load balance
- how memory usage grows while encoding runs
- cachegrind and simplescalar cache simulation results
Tuesday, December 14, 2010
Research Paper TODO list
Tuesday, November 9, 2010
Friday, October 22, 2010
Skype call
- resolved the segmentation faults in p-cblk()
- the col_grp() function: use the Intel vector instructions
- use the ASM directive to insert the assembly code in the C source code
- operate on 64 bits (quadword) instead of 32 bits at a time
- compilation: #gcc -O2 -o testcopy testcopy.c vectorcopy.s
- loop unrolling: instead of MOV, ADD(4), we do 4 MOV and ADD(16)
- gcc can unroll loops itself when given certain compile options (e.g. -funroll-loops); look that up
- Tasks
- - split_col_grp() is mainly data copying; try to speed it up
- - try gcc unrolling on jasper, and profile/record the execution time
- - try vector instructions (ASM directive)
- - publish a paper by the end of the year
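A minimal C sketch of the two ideas from the call, combined: move 64 bits at a time instead of 32, and unroll the copy loop by four. The function name copy_samples_unrolled is hypothetical and this is not the JasPer split_col_grp() code; it only illustrates the pattern before trying hand-written assembly. The fixed-size memcpy calls are normally compiled into single 64-bit moves and avoid alignment problems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not JasPer code): copy n 32-bit samples,
 * moving one quadword (two samples) per load/store and handling
 * four quadwords per loop iteration. */
void copy_samples_unrolled(int32_t *dst, const int32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    size_t i = 0;

    /* Unrolled body: 4 quadword moves, then the index advances by 8
     * samples, instead of 1 move and a small increment per iteration. */
    for (; i + 8 <= n; i += 8) {
        uint64_t q0, q1, q2, q3;
        memcpy(&q0, src + i,     sizeof q0);
        memcpy(&q1, src + i + 2, sizeof q1);
        memcpy(&q2, src + i + 4, sizeof q2);
        memcpy(&q3, src + i + 6, sizeof q3);
        memcpy(dst + i,     &q0, sizeof q0);
        memcpy(dst + i + 2, &q1, sizeof q1);
        memcpy(dst + i + 4, &q2, sizeof q2);
        memcpy(dst + i + 6, &q3, sizeof q3);
    }

    /* Tail: copy the remaining 0..7 samples one at a time. */
    for (; i < n; i++)
        dst[i] = src[i];
}
```

Whether this beats a plain loop compiled with -O2 has to be measured; the compiler may already generate similar or better code.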
Sunday, October 17, 2010
Research Progress
- Fixed the segmentation faults when running jasper with more than 2 threads. (The error was caused by incorrect incrementing of the workload-distribution index in 2 nested loops within the enc_cblks() routine.)
- The fused loop also gives some indication of a performance improvement
- try to fuse more loops if possible
Friday, October 15, 2010
Skype Call today
- clarification on the report
- run the pc+pd 4 multiple times, see if the result is always bad
- run the pc+pd 2 multiple times, see if the result is always good
- quote or paraphrase the literature from other source
- segmentation fault related to the image quality?
- diving into the code to correct the seg. fault for multithreading..
- cache issue reported in the previous paper: does the new version of jasper fix it, or has the code been updated or changed?
- p-dwt: poor performance may be caused by cache reads & writes between processors => overhead
- whether it's possible to improve the dwt performance on a single core at the C code level, with special instructions for example.
- look at the 2 dwt loops for horizontal and vertical filtering, come up with some ideas for improvement, and discuss them at next week's meeting.
- next Tuesday will have the discussion for the course material
Saturday, October 9, 2010
Research Progress
- talked about the p-dwt performance, hoping to get a better one
- the elec871 course reading: chapter 3 (cache coherence topic), and the Culler book material related to chapter 5
- checked the threading in the p-dwt functions. ok
- checked the p-cblk functions.
- thread safety, run jasper with 4 threads, segmentation faults happened a lot, the run is ok with 2 threads.
- updated jasper code in cvs..
Tuesday, September 14, 2010
Research Progress
- Done the Parallel DWT
- Tested on the cats.pnm, and goldenears.pnm on OTTAWA
- Re-write the pointer operation in the parallel region.
- Created the research document for putting the useful thoughts together.
- File operation question in the enc_cblk()
- Wrote a piece of software for parsing the time result file.
Shorthand at the Command Prompt
- / - root directory
- ./ - current directory
- ./command_name - run a command in the current directory when the current directory is not on the path
- ../ - parent directory
- ~ - home directory
- $ - typical prompt when logged in as ordinary user
- # - typical prompt when logged in as root or superuser
- ! - repeat specified command
- !! - repeat previous command
- ^^ - repeat previous command with substitution
- & - run a program in background mode
- [Tab][Tab] - prints a list of all available commands. This is just an example of autocomplete with no restriction on the first letter.
- x[Tab][Tab] - prints a list of all available completions for a command, where the beginning is ``x''
- [Alt][Ctrl][F1] - switch to the first virtual text console
- [Alt][Ctrl][Fn] - switch to the nth virtual text console. Typically, there are six on a Linux PC system.
- [Alt][Ctrl][F7] - switch to the first GUI console, if there is one running. If the graphical console freezes, one can switch to a nongraphical console, kill the process that is giving problems, and switch back to the graphical console using this shortcut.
- [ArrowUp] - scroll through the command history (in bash)
- [Shift][PageUp] - scroll terminal output up. This also works at the login prompt, so you can scroll through your boot messages.
- [Shift][PageDown] - scroll terminal output down
- [Ctrl][Alt][+] - switch to next X server resolution (if the server is set up for more than one resolution)
- [Ctrl][Alt][-] - change to previous X server resolution
- [Ctrl][Alt][BkSpc] - kill the current X server. Used when normal exit is not possible.
- [Ctrl][Alt][Del] - shut down the system and reboot
- [Ctrl]c - kill the current process
- [Ctrl]d - logout from the current terminal
- [Ctrl]s - stop transfer to current terminal
- [Ctrl]q - resume transfer to current terminal. This should be tried if the terminal stops responding.
- [Ctrl]z - suspend the current process (use bg to let it continue in the background, fg to bring it back)
- reset - restore a terminal to its default settings
- [Leftmousebutton] - Hold down left mouse button and drag to highlight text. Releasing the button copies the region to the text buffer under X and (if gpm is installed) in console mode.
- [Middlemousebutton] - Copies text from the text buffer and inserts it at the cursor location. With a two-button mouse, click on both buttons simultaneously. It is necessary for three-button emulation to be enabled, either under gpm or in XF86Config.
Tuesday, August 24, 2010
Research Progress
- fixed the Eclipse debugger problem of the variable view not updating values, by compiling the source code without any optimization (i.e. no -O2 in the gcc options.)
- tracking in the wavelet transform.
Friday, August 20, 2010
Performance Tools for Software Developers - Application Notes - Intel® IPP JPEG2000 and JasPer in Ksquirrel
Intel® Integrated performance Primitives website
- Accelerating JasPer JPEG2000 Decoding in KSquirrel with Intel® Integrated Performance Primitives (Intel® IPP)
- Version Information
- Downloading KSquirrel, JasPer, Intel® IPP JPEG Source Code
- Source Code Modification
- Building KSquirrel Jasper & Intel® IPP JPEG2000
- Running KSquirrel with IPP-based JPEG2000 decoding
- Appendix A - Performance Comparison
- Appendix B - Verifying Correctness
- Appendix C - Known Issues and Limitations
- Appendix D - References
Wednesday, August 18, 2010
Research Progress
- Re-profiled jasper on the cluster; the result is different from the one on VirtualBox
- Read papers: wavelet transform, entropy coding (3 passes) sections
- Inspected the Java applets for various JPEG demos; may use jj2000 to understand the coding algorithm.
Friday, August 13, 2010
Research Progress
- Found 2 papers about jasper parallelization in OpenMP [111] [112]
- Prepare the progress
- Re-ran with 4 threads on ottawa.cal.ee.queensu.ca; bw2.pnm generates an error (segmentation fault), cannot tell why.
Thursday, August 12, 2010
Research Progress
- Entropy coding
- Three passes, significance pass.
- Debug the code in the sig_pass_step
Tuesday, August 10, 2010
Color in bash
- echo -e '\E[COLOR1;COLOR2mSome text goes here.'
| Color | Foreground | Background |
|---|---|---|
| black | 30 | 40 |
| red | 31 | 41 |
| green | 32 | 42 |
| yellow | 33 | 43 |
| blue | 34 | 44 |
| magenta | 35 | 45 |
| cyan | 36 | 46 |
| white | 37 | 47 |
Example:
- bash$ echo -e '\E[34;47mThis prints in blue.'; tput sgr0
- bash$ echo -e '\E[33;44m'"yellow text on blue background"; tput sgr0
The tput sgr0 restores the terminal settings to normal. Omitting this lets all subsequent output from that particular terminal remain blue.
The simplest, and perhaps most useful ANSI escape sequence is bold text, \033[1m ... \033[0m. The \033 represents an escape, the "[1" turns on the bold attribute, while the "[0" switches it off. The "m" terminates each term of the escape sequence.
- bash$ echo -e "\033[1mThis is bold text.\033[0m"
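The same escape sequences can be produced from C. The sketch below builds a colored string from the foreground/background codes in the table above and ends it with \033[0m, which resets all attributes (playing the role that tput sgr0 plays in the shell examples). The function name format_colored is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "ESC[fg;bgm" + text + "ESC[0m" into buf.
 * fg is a foreground code (30..37), bg a background code (40..47).
 * Returns what snprintf returns: the length the full string needs. */
int format_colored(char *buf, size_t size, int fg, int bg, const char *text)
{
    assert(buf != NULL && text != NULL);
    assert(fg >= 30 && fg <= 37);
    assert(bg >= 40 && bg <= 47);
    return snprintf(buf, size, "\033[%d;%dm%s\033[0m", fg, bg, text);
}
```

For example, format_colored(buf, sizeof buf, 33, 44, "yellow on blue") followed by puts(buf) prints the text in yellow on a blue background on an ANSI-capable terminal.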
Thursday, August 5, 2010
OpenMP Scheduling
- SCHEDULE: Describes how iterations of the loop are divided among the threads in the team. The default schedule is implementation dependent.
- STATIC
- Loop iterations are divided into pieces of size chunk and then statically assigned to threads. If chunk is not specified, the iterations are evenly (if possible) divided contiguously among the threads.
- DYNAMIC
- Loop iterations are divided into pieces of size chunk, and dynamically scheduled among the threads; when a thread finishes one chunk, it is dynamically assigned another. The default chunk size is 1.
- GUIDED
- For a chunk size of 1, the size of each chunk is proportional to the number of unassigned iterations divided by the number of threads, decreasing to 1. For a chunk size with value k (greater than 1), the size of each chunk is determined in the same way with the restriction that the chunks do not contain fewer than k iterations (except for the last chunk to be assigned, which may have fewer than k iterations). The default chunk size is 1.
- RUNTIME
- The scheduling decision is deferred until runtime by the environment variable OMP_SCHEDULE. It is illegal to specify a chunk size for this clause.
- AUTO
- The scheduling decision is delegated to the compiler and/or runtime system.
- NO WAIT / nowait: If specified, then threads do not synchronize at the end of the parallel loop.
- ORDERED: Specifies that the iterations of the loop must be executed as they would be in a serial program.
- COLLAPSE: Specifies how many loops in a nested loop should be collapsed into one large iteration space and divided according to the schedule clause. The sequential execution of the iterations in all associated loops determines the order of the iterations in the collapsed iteration space.
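A small C sketch of the clauses above, with hypothetical function names. static_owner() shows how schedule(static, chunk) deals chunks to threads round-robin in thread-number order, and sum_squares() shows where the schedule clause goes on a loop. Without OpenMP enabled at compile time (e.g. no -fopenmp with gcc) the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Which thread executes iteration i under schedule(static, chunk)?
 * Iterations are cut into chunks of size chunk, and the chunks are
 * assigned to threads 0, 1, ..., nthreads-1, 0, 1, ... in order. */
int static_owner(size_t i, size_t chunk, int nthreads)
{
    assert(chunk > 0 && nthreads > 0);
    return (int)((i / chunk) % (size_t)nthreads);
}

/* Sum of i*i for i in [0, n). Chunks of 4 iterations are handed out
 * dynamically; the reduction clause gives each thread a private
 * partial sum that is combined at the end of the loop. */
long sum_squares(int n)
{
    long total = 0;
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < n; i++)
        total += (long)i * i;
    return total;
}
```

Dynamic scheduling is only a placeholder choice here; for loops with uniform iteration cost, static usually has less overhead, and the best setting has to be measured.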
Tuesday, July 27, 2010
Linux Bash Scripts
- Bash Script example - http://www.cs.iastate.edu/~cs104/notes/scripts.html
- Linux terminal output redirection: $ script.sh &>> temp.txt [bash v4.0+]
- File test operators - http://tldp.org/LDP/abs/html/fto.html

Saturday, July 24, 2010
GNU Make
5.7 Recursive Use of make
Recursive use of make means using make as a command in a makefile. This technique is useful when you want separate makefiles for various subsystems that compose a larger system. For example, suppose you have a subdirectory subdir which has its own makefile, and you would like the containing directory's makefile to run make on the subdirectory. You can do it by writing this:
subsystem:
	cd subdir && $(MAKE)
or, equivalently, this (see Summary of Options):
subsystem:
	$(MAKE) -C subdir
You can write recursive make commands just by copying this example, but there are many things to know about how they work and why, and about how the sub-make relates to the top-level make. You may also find it useful to declare targets that invoke recursive make commands as `.PHONY' (for more discussion on when this is useful, see Phony Targets).
For your convenience, when GNU make starts (after it has processed any -C options) it sets the variable CURDIR to the pathname of the current working directory. This value is never touched by make again: in particular note that if you include files from other directories the value of CURDIR does not change. The value has the same precedence it would have if it were set in the makefile (by default, an environment variable CURDIR will not override this value). Note that setting this variable has no impact on the operation of make (it does not cause make to change its working directory, for example).
- MAKE Variable: The special effects of using `$(MAKE)'.
- Variables/Recursion: How to communicate variables to a sub-make.
- Options/Recursion: How to communicate options to a sub-make.
- -w Option: How the `-w' or `--print-directory' option helps debug use of recursive make commands.
Thursday, July 22, 2010
PC Systems Programming Essentials
- Introduction to Binary and Hexadecimal (~27K)
- Binary Operations (~26K)
- Binary Manipulations (~10K)
- Memory in the PC (~23K)
- Calling Interrupts (~21K)
- Hardware Ports (~20K)
Sunday, July 18, 2010
Working Note
- added bof_thread.h to /src/libjasper/include/jasper
- added bof_thread.h to libjasperinclude_HEADERS = \... in the Make file /src/libjasper/include/jasper
- moved #include to the jasper.h
- undo step 1-3, remove bof_thread.h
- added the linkedList struct and functions to the jas_malloc.h
- implemented the function body in the jas_malloc.c
- imgcmp also uses jasper_seq.c, so if the linked list is defined in both jasper.c and jasper_seq.c, compiling imgcmp.c fails. The variables and functions for the linked list are therefore defined in jas_malloc.c, which is available to both jasper.c and imgcmp.c
- extern for the variable
- file A - the host, host the variable name and store its value
- file B - the one uses the variable in file A - to read and write
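A minimal sketch of the extern pattern, shown in one block for brevity; in practice the three parts would live in separate files. The names alloc_count and record_alloc are hypothetical and are not taken from jas_malloc.c.

```c
#include <assert.h>

/* --- shared header (hypothetical): included by file A and file B --- */
extern int alloc_count;     /* declaration only, reserves no storage */
void record_alloc(void);

/* --- file A, the host: the one place that defines the variable --- */
int alloc_count = 0;

/* --- file B, the user: reads and writes the variable via extern --- */
void record_alloc(void)
{
    assert(alloc_count >= 0);
    alloc_count++;
}
```

Keeping the definition in exactly one .c file is what avoids duplicate-definition errors when several programs link against the same sources.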
Notes to Jasper Software
1. Linux
==================
- changed -p to -pg in CFLAGS in every Makefile in every subdirectory of the Jasper package
=====================================
2. Simple Scalar in cygwin
=====================================
------------------------------------------------------
--------------------------------------------------
2.2 Generate Assembly code in SimpleScalar
--------------------------------------------------
......
=======================
4. thread in sim-mpfast
---------------------------------
- Need this header file
- Need the following library
/mp_simplesim/libssmp.a -o jasper jasper.o ../libjasper/libjasper.la
- Changes made to the Makefile (src/appl/Makefile)
Thursday, June 17, 2010
C Programming Notes
Steve Summit
These notes are part of the UW Experimental College course on Introductory C Programming. They are based on notes prepared (beginning in Spring, 1995) to supplement the book The C Programming Language, by Brian Kernighan and Dennis Ritchie, or K&R as the book and its authors are affectionately known. (The second edition was published in 1988 by Prentice-Hall, ISBN 0-13-110362-8.) These notes are now (as of Winter, 1995-6) intended to be stand-alone, although the sections are still cross-referenced to those of K&R, for the reader who wants to pursue a more in-depth exposition.
Chapter 2: Basic Data Types and Operators
Chapter 3: Statements and Control Flow
Chapter 4: More about Declarations (and Initialization)
Chapter 5: Functions and Program Structure
Chapter 13: Reading the Command Line
Memory Allocation
[source: http://www.eskimo.com/~scs/cclass/notes/sx11.html]
In this chapter, we'll meet malloc, C's dynamic memory allocation function, and we'll cover dynamic memory allocation in some detail.
As we begin doing dynamic memory allocation, we'll begin to see (if we haven't seen it already) what pointers can really be good for. Many of the pointer examples in the previous chapter (those which used pointers to access arrays) didn't do all that much for us that we couldn't have done using arrays. However, when we begin doing dynamic memory allocation, pointers are the only way to go, because what malloc returns is a pointer to the memory it gives us. (Due to the equivalence between pointers and arrays, though, we will still be able to think of dynamically allocated regions of storage as if they were arrays, and even to use array-like subscripting notation on them.)
You have to be careful with dynamic memory allocation. malloc operates at a pretty ``low level''; you will often find yourself having to do a certain amount of work to manage the memory it gives you. If you don't keep accurate track of the memory which malloc has given you, and the pointers of yours which point to it, it's all too easy to accidentally use a pointer which points ``nowhere'', with generally unpleasant results. (The basic problem is that if you assign a value to the location pointed to by a pointer:
    *p = 0;
and if the pointer p points ``nowhere'', well actually it can be construed to point somewhere, just not where you wanted it to, and that ``somewhere'' is where the 0 gets written. If the ``somewhere'' is memory which is in use by some other part of your program, or even worse, if the operating system has not protected itself from you and ``somewhere'' is in fact in use by the operating system, things could get ugly.)
11.1 Allocating Memory with malloc
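As a small illustration of the points above (this is not an excerpt from the course notes), the sketch below allocates an array with malloc, checks the returned pointer before writing through it, and uses array-style subscripts on the result. The function name make_squares is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example: return a malloc'ed array holding 0, 1, 4, 9, ...
 * The caller owns the memory and must free() it. Returns NULL if the
 * allocation fails, so nothing is ever written through a bad pointer. */
int *make_squares(size_t n)
{
    assert(n > 0);
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = (int)(i * i);    /* subscript notation on malloc'ed memory */
    return a;
}
```

After free(a), setting a = NULL in the caller makes an accidental later use fail fast instead of writing ``somewhere''.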
Friday, June 11, 2010
Simulating the Jasper software with sim-mpfast
Friday, June 4, 2010
Creatine
Creatine has a very specific effect with very specific training protocols. Arbitrarily adding creatine supplementation without considering training is a huge mistake. Most studies show that a single bout of maximal or sub-maximal effort is not sufficient to elicit a response from creatine supplementation. Creatine has been shown to delay the onset of muscular fatigue during repeated bouts of work. A single bout of work appears to show no improvement with creatine supplementation.
This is more than likely due to the role that creatine plays with ATP resynthesis. A single bout of work will deplete ATP stores, yet it is the regeneration of ATP that creatine supplementation affects. Creatine also increases the amount of time that maximal output can be performed - for example, it may increase the duration of a heavy lift, which means more repetitions at the same weight. All of these factors tend to indicate that two major elements are required to benefit from creatine supplementation:
- Intensity - in other words, maximal or sub-maximal output
- Duration and repetition - in other words, multiple bouts of work
More than likely, these factors are what provided the success of one study, which concluded that enhanced performance and increase of lean mass were due to "higher quality training sessions." These sessions would include moderate to high intensity weights, and moderate to high volume with multiple sets.
Is Creatine Supplementation For Everyone?
Creatine supplementation may not be effective for everyone. There are possible safety concerns with creatine supplementation that will be discussed later. Due to the mechanisms by which creatine supplementation works, it may not be effective for endurance athletes to supplement with creatine. A significant percentage of the general population appears to have no response to creatine.
People on vegetarian diets seem to have a greater response to creatine, theoretically due to the lack of dietary creatine intake. From this, it can be inferred that individuals who consume large amounts of protein on a daily basis, especially red meat, will have a less significant response to creatine supplementation due to the amount already being ingested through typical dietary means.
It is interesting to note that most creatine research uses the standard protocol of 5 g / d for "maintenance". Anecdotal evidence suggests a high rate of success with creatine supplementation. This same evidence indicates that doses in the field are much higher than the established research protocol or recommended label amounts.
This may account for a higher anecdotal rate of success and perceived effect in the field as opposed to what is suggested in the literature. Anecdotal evidence is not a substitute for scientific research, but should be taken into account. What happens in "the real world" is much more important than what occurs in isolated, scientific trials when trying to make a "real world" application of creatine supplementation.
So What Is The Ideal Creatine Cycle?
Based on the information provided here, I propose the following cycle. The length of an ideal cycle would be relatively short. Many studies suggest that the main response to creatine supplementation occurs during the first week, with subsequent weeks of supplementation rendering no significant increase of performance or mass.
Research is very limited with regard to extended cycles at high doses, however. The cessation of ergogenic effects seems to correlate to the end of the "loading" phase. It is therefore suggested that an extended loading phase may prolong the ergogenic effects. It is also important to cycle off of the product for a prolonged period of time, due to the high dose of the cycle and the potential for contaminants in the product.
Supplement Cycle
First, the cycle will be short, only 4 weeks in duration. It will involve a rapid "ramp-up" with a corresponding "ramp-down" of creatine and incorporate glutamine supplementation. Nutrition will be manipulated to favor hypertrophy during the first 3 weeks, then take advantage of super compensation and unloading for the final week.
- First, determine a baseline creatine dose.
- For the average individual, this is proposed to be 0.3 g / kg lean mass.
- For vegetarians, consider 0.4 g / kg lean mass.
- For those with predominant protein (35% of total calories or higher) in the diet, and those who consume at least 1 portion of red meat daily, consider 0.2 g / kg lean mass.
- A discussion of glutamine is outside the scope of this article. The proposed dose is 0.3 g / kg lean mass.
An example individual weighs 180 pounds at 12% body fat. Lean mass is determined to be 158 pounds, or 72 kg. The individual has predominant protein in their diet and consumes red meat frequently. Therefore, the baseline creatine dose is computed to be 72 kg * 0.2 g / kg = 14 grams. Glutamine dose is set at 72 kg * 0.3 g / kg = 22 grams.
Glutamine will be divided into 3 doses: pre-workout, post-workout, and pre-bedtime. This equates to 7 grams pre-workout, 7 grams post-workout, and 8 grams pre-bedtime.
Creatine will be "ramped up". The first week will be 50% of the baseline. Second week is 100% of the baseline, and third week is 150% of the baseline. The unloading week is 50% of the baseline. The creatine will be consumed post-workout (75%) and pre-bedtime (25%). To summarize dosing:
Week 1:
Creatine: 5g post-workout, 2g before bed.
Glutamine: 7g pre-workout, 7g post-workout, 8g before bed.
Week 2:
Creatine: 11g post-workout, 3g before bed.
Glutamine: 7g pre-workout, 7g post-workout, 8g before bed.
Week 3:
Creatine: 16g post-workout, 5g before bed.
Glutamine: 7g pre-workout, 7g post-workout, 8g before bed.
Week 4:
Creatine: 5g post-workout, 2g before bed.
Glutamine: 7g pre-workout, 7g post-workout, 8g before bed.
Week 5:
All supplementation ceases (cycle is complete).
Training Cycle
In order to take advantage of various systems of muscular energetics, a holistic approach is recommended. This approach would involve a series of "mega-sets" (Dr. Fred Hatfield's "Holistic sets" or "ABC training") designed to recruit a broad spectrum of muscle fiber types for each muscle group. An example mega-set for chest might be:
- 6 reps 90% intensity - explosive
- 10 reps 70% intensity - moderate
- 40 reps 55% intensity - slow
Intensity is expressed as a percentage of one rep max. If the subject can bench 200 pounds for a single rep, then the mega-set would be:
- 6 reps at 180 pounds - explosive tempo (accelerate as quickly as possible)
- 10 reps at 140 pounds - steady tempo (1 second down, 1 second up)
- 40 reps at 110 pounds - slow tempo (3 seconds down, 2 seconds up).
The mega-set is performed with minimal rest - only enough time to strip the weight between mini-sets. After a mega-set, rest no more than 1 minute and repeat the mega-set for a total of three (3) times. Note that these reps are general guidelines. A person with predominantly slow-twitch (endurance) fiber in their chest would have higher reps and may only perform 2 sets, as opposed to another individual with explosive fiber in their chest.
Holistic sets are very taxing on the central nervous system. For this reason, a moderate workout should be used to extend recovery while preventing atrophy. An example schedule for this program:
Sunday, May 30, 2010
JPEG 2000 - Tutorial
- JPEG2000_1to50.pdf 09-Nov-2004 06:14 1.0M
- JPEG2000_51to100.pdf 09-Nov-2004 06:14 1.0M
- JPEG2000_101to125.pdf 09-Nov-2004 06:14 4.0M
- JPEG2000_126to150.pdf 09-Nov-2004 06:14 2.8M