If you are reading this, stop. Log into the Informatica support site and download the performance tuning guide. That is the first step. It is not comprehensive as each database structure and corporate infrastructure is different. But the basics are there in detail.
Before you continue, read your thread statistics in the session log. I assure me that this is going to point in the general direction of the issue. Save the thread stats for comparison after tuning. 
Use Grid Computing
If your company is not penny wise and pound foolish, take the time to do a cost estimate on a grid and an integration server load balancer. Without this, which I do not have at the time of this writing, you are constrained beyond any measure your coding can resolve.
Use Concurrency
This is true of all databases, but has a different flavor depending on your target and source structure (flat file to OLTP, OLTP to flat DW, Flat stage to schema data mart, OLTP to flat file). 
Override Tracing
One more option in session properties under config object tab --> Error 
handling section just change Override tracing to 
NORMAL from 
NONE..atleast 25% performacne improvement you can observe.
OLTP
Make sure your OLTP database has no more dependencies than required. For an OLTP design where you are not the database developer, work with the application development team to understand the requirements around all the foreign keys. This helps you determine what the load order for tables can be, as well as allowing you to see what groups of tables can be loaded at the same time. 
Data Marts
For a star schema, only allow dimensions to be linked through the fact. I know that is obvious, but people always ask for one little compromise for one report or another. Load conformed dimensions before all other data mart processing. Then load as many dims at the same time as your database engine can handle. then load the facts. I prefer using post SQL to call a sproc to calculate the dimensions start and end dates. 
Load hub and satellite models using the logic of the Data Vault 2.0 logic. Somewhat similar to loading dims before facts, but you have links to account for as well. It has been so long since I have done, but I think the order is satellites, Hubs, and then links. Look it up.
Line Size
All session tuning revolves around limiting the precision of each port and knowing how big the largest datum will be for each port. For file sources or targets count the max size of each field for a row. That per field count is set as the line sequential length.
DTM Buffer Block Size
 
Increase the DTM buffer block setting in relation to the size of the 
rows. The integration service only allocates two blocks for each source 
and
 target by default. Large volume systems do not play well with this. 
However, over allocation  will gracelessly fail the session for reasons 
that only Informatica understands. To me I would rather have that as an 
option (degraded through put vs failure) rather than have it rammed down
 my throat.
The block size calculation depends on the number of Source and Target and the precision of their rows.
Calculation for SessionBufferBlock
SessionBufferBlock = (((NumberOfSource + NumberOfTarget) * 2))
Calculation for Port Precision
Add the maximum size of the data types all the ports in the source and target. I *think* this excludes the intermediary transformations. Mapping Designer - Target Instance - Ports tab and 
Mapping Designer - Source Instance - Ports tab.
Calculation for Buffer Block Size
BufferBlockSize = 20 * (Total Precision of Source and Target)
Calculation for DTM size to accommodate the calculated BufferBlock Size;
(SessionBufferBlock * BufferBlockSize * number of partition)/(0.9)
WF Mgr - Session - Config Object - Advanced - Default Buffer
 Block Size: How many blocks do I need to provide for all sources and 
targets. Don't use auto is what I was told unless you have one of each.
To quote the guide:
A
 session that contains n partitions, set the DTM Buffer Size to at least
 n times the value for the session with one partition. The Log Manager 
writes a warning message in the session log if the number of memory 
blocks is so small that it causes performance degradation. The Log 
Manager writes this warning message even if the number of memory blocks 
is enough for the session to run 
successfully. The warning message also gives a recommended value.
DTM Buffer Size
Minimum size to allow 20 rows to be processed. The DTM Buffer Size setting specifies the amount of memory the Integration Service uses for the Data Trasnform Manager to buffer memory. Default allocates a minimum of 12 MB for DTM buffer 
memory. When you enter a value without a unit of measure ittakes that to mean bytes 1024 = 1024 byte. UoM you can use are: KB, MB, GB.
You keep the buffer for non-double byte data as 12MB.
You increase the buffer size for Unicode double byte chars to 24MB
Multiple either of these by the number of partitions.
If any block you are writing is of a data type larger than the specified size (e.g. BLOB) increase the size to the max expected for a single insert.
WF Mgr - Session - Properties - DTM Buffer Size:
(NumberOfSource + NumberOfTarget) * 2) * BufferBlockSize * Partitions )/0.9
This equation is from the Performance Analyzer output from the Communities pages.  I have also seen (session Buffer Blocks) = (.9) * (DTM Buffer Size) / (Default Buffer Block Size) * (number of partitions)
Increase the property by multiples of the buffer block size, and then run and time the session after each increase. 
Caches
Some transformations require caching: Agg, Rank, LKP, Join. 
¨ Limit the number of connected input/output and output only ports.
¨ Select the optimal cache directory location that is available to the int service as a power center resource.
¨ Increase the cache sizes.
¨ Use the 64-bit version of PowerCenter to run large cache sessions.
When you attach the session log to a post-session email, enable flat file logging.
Connections
Always use native/manufacturer connections over generic ODBC connections.
Microsoft SQL Server, consult your database documentation for information about how to increase the packet size. Microsoft SQL Server, must also change the packet size in the relational connection object in the Workflow Manager to reflect the database server packet size. the current size can be found using SSMS - Server - Properties - Advanced - Network Packet Size (4096 default I think).
TIPS:
If you are not sure you have a target bottle neck, change the target to a flat file local to the integration server. If performance is better then you need to tune the target checkpoint interval , packet size , database design (indexing, partitioning), or timing in relationship to other operations on the target (indexing, backups).
REFERENCES:
http://aambarish.blogspot.com/2012/05/tuning-sessions-for-better-performance.html
http://makingdatameaningful.com/2012/09/18/data_vault-hubs_links_and_satellites_with_associated_loading_patterns/