TPC-H Dataset – Query Performace Tuning

Splice Machine Community Forum Operational Workloads TPC-H Dataset – Query Performace Tuning

This topic contains 5 replies, has 3 voices, and was last updated by  jyuan 3 weeks, 5 days ago.

Viewing 6 posts - 1 through 6 (of 6 total)
  • Author
    Posts
  • #766258

    rakeshj
    Participant

    I Have installed splicemachine community edition on my 3 node HDP cluster (1 management node + 3 worker nodes).
    I am exploring TPC-H provided queries with 1GB of TPC-H dataset, but few of these queries just hung up , and few of them takes more then 1 to 2 mins to execute.
    e.g. The below Q2 just hung up at our end
    — Minimum Cost Supplier Query (Q2)–

    SELECT
    S.ACCTBAL,
    S.NAME,
    N.NAME,
    P.PARTKEY,
    P.MFGR,
    S.ADDRESS,
    S.PHONE,
    S.COMMENT
    FROM
    PART P,
    SUPPLIER S,
    PARTSUPP PS,
    NATION N,
    REGION R
    WHERE
    P.PARTKEY = PS.PARTKEY
    AND S.SUPPKEY = PS.SUPPKEY
    AND P.SIZE = 15
    AND P.TYPE LIKE ‘%BRASS’
    AND S.NATIONKEY = N.NATIONKEY
    AND N.REGIONKEY = R.REGIONKEY
    AND R.NAME = ‘EUROPE’
    AND PS.SUPPLYCOST = (
    SELECT MIN(PS.SUPPLYCOST)
    FROM
    PARTSUPP PS1, SUPPLIER S1,
    NATION N1, REGION R1
    WHERE
    P.PARTKEY = PS1.PARTKEY
    AND S.SUPPKEY = PS1.SUPPKEY
    AND S.NATIONKEY = N.NATIONKEY
    AND N.REGIONKEY = R.REGIONKEY
    AND R.NAME = ‘EUROPE’
    )
    ORDER BY
    S.ACCTBAL DESC,
    N.NAME,
    S.NAME,
    P.PARTKEY

    I have 28GB of memory on my worker nodes, so memory should not be the problem.
    I see spark UI, but

    Are there any performance tuning options/configurations (may be specific to splicemachine or SPARK) that we should tune to get these queries executed?

    Your suggestions on would be definitely appreciated.

    Thanks,
    Rakesh Jamdare

    #766259

    jyuan
    Moderator

    Did you collect statistics for TPCH tables? To do this, run ‘analyze schema tpch’.
    http://doc.splicemachine.com/Developers/CmdLineReference/CmdAnalyze.html?Highlight=Analyze%20schema

    If you have done so, and the query still hangs. Can you run explain plan and post the results?
    http://doc.splicemachine.com/Administrators/PerformanceTuning/ExplainPlanExamples.html?Highlight=explain%20plan

    #766261

    rakeshj
    Participant

    Hi JYUAN,

    — Corrected query,(I did mess up with inner subquery, so modified it):
    — Minimum Cost Supplier Query–

    SELECT
    S.ACCTBAL,
    S.NAME,
    N.NAME,
    P.PARTKEY,
    P.MFGR,
    S.ADDRESS,
    S.PHONE,
    S.COMMENT
    FROM
    PART P,
    SUPPLIER S,
    PARTSUPP PS,
    NATION N,
    REGION R
    WHERE
    P.PARTKEY = PS.PARTKEY
    AND S.SUPPKEY = PS.SUPPKEY
    AND P.SIZE = 15
    AND P.TYPE LIKE ‘%BRASS’
    AND S.NATIONKEY = N.NATIONKEY
    AND N.REGIONKEY = R.REGIONKEY
    AND R.NAME = ‘EUROPE’
    AND PS.SUPPLYCOST = (
    SELECT MIN(PS.SUPPLYCOST)
    FROM
    PARTSUPP PS1, SUPPLIER S1,
    NATION N1, REGION R1
    WHERE
    P.PARTKEY = PS1.PARTKEY
    AND S1.SUPPKEY = PS1.SUPPKEY
    AND S1.NATIONKEY = N1.NATIONKEY
    AND N1.REGIONKEY = R1.REGIONKEY
    AND R1.NAME = ‘EUROPE’
    )
    ORDER BY
    S.ACCTBAL DESC,
    N.NAME,
    S.NAME,
    P.PARTKEY

    — Query PLAN before ANALYZE SCHEMA SPLICE (ALL TPC-H related tables are in schema named ‘SPLICE’)

    Cursor(n=25,rows=19775,updateMode=,engine=Spark)                                                                                                                        
      ->  ScrollInsensitive(n=24,totalCost=77949.072,outputRows=19775,outputHeapSize=50.91 MB,partitions=1)                                                                 
        ->  OrderBy(n=23,totalCost=43308.371,outputRows=19775,outputHeapSize=50.91 MB,partitions=1)                                                                         
          ->  ProjectRestrict(n=22,totalCost=4333.835,outputRows=19775,outputHeapSize=50.91 MB,partitions=1)                                                                
            ->  BroadcastJoin(n=21,totalCost=4333.835,outputRows=19775,outputHeapSize=50.91 MB,partitions=1,preds=[(N.REGIONKEY[36:20] = R.REGIONKEY[36:21])])              
              ->  TableScan[REGION(1488)](n=20,totalCost=4.054,outputRows=18,outputHeapSize=50.91 MB,partitions=1,preds=[(R.NAME[34:2] = EUROPE)])                          
              ->  BroadcastJoin(n=19,totalCost=4325.4,outputRows=19775,outputHeapSize=50.908 MB,partitions=1,preds=[(S.NATIONKEY[32:11] = N.NATIONKEY[32:18])])             
                ->  TableScan[NATION(1504)](n=18,totalCost=4.054,outputRows=20,outputHeapSize=50.908 MB,partitions=1)                                                       
                ->  BroadcastJoin(n=17,totalCost=4316.894,outputRows=19775,outputHeapSize=50.906 MB,partitions=1,preds=[(P.PARTKEY[28:4] = PS.PARTKEY[28:15]),(PS.SUPPLYCOST[28:17] = AggFlatSub-0-1.SQLCol1[28:1]),(S.SUPPKEY[28:8] = PS.SUPPKEY[28:16])])
                  ->  TableScan[PARTSUPP(1568)](n=16,totalCost=552.583,outputRows=24414,outputHeapSize=50.906 MB,partitions=1)                                              
                  ->  BroadcastJoin(n=15,totalCost=374.09,outputRows=16018,outputHeapSize=16.962 MB,partitions=1,preds=[(S.NATIONKEY[24:11] = AggFlatSub-0-1.NATIONKEY[24:3])])
                    ->  TableScan[SUPPLIER(1536)](n=14,totalCost=4.054,outputRows=20,outputHeapSize=16.962 MB,partitions=1)                                                 
                    ->  BroadcastJoin(n=13,totalCost=365.496,outputRows=19775,outputHeapSize=20.937 MB,partitions=1,preds=[(P.PARTKEY[20:4] = AggFlatSub-0-1.PARTKEY[20:2])])
                      ->  ProjectRestrict(n=12,totalCost=117.881,outputRows=2316,outputHeapSize=20.937 MB,partitions=1,preds=[like(P.TYPE[18:3], %BRASS)])                  
                        ->  TableScan[PART(1472)](n=11,totalCost=113.699,outputRows=4394,outputHeapSize=20.937 MB,partitions=1,preds=[(P.SIZE[18:4] = 15)])                 
                      ->  ProjectRestrict(n=10,totalCost=6211.676,outputRows=24414,outputHeapSize=23.252 MB,partitions=1)                                                   
                        ->  GroupBy(n=9,totalCost=6211.676,outputRows=24414,outputHeapSize=23.252 MB,partitions=1)                                                          
                          ->  ProjectRestrict(n=8,totalCost=6211.676,outputRows=24414,outputHeapSize=23.252 MB,partitions=1)                                                
                            ->  BroadcastJoin(n=7,totalCost=6211.676,outputRows=24414,outputHeapSize=23.252 MB,partitions=1,preds=[(S1.SUPPKEY[12:3] = PS1.SUPPKEY[12:7])]) 
                              ->  TableScan[PARTSUPP(1568)](n=6,totalCost=552.583,outputRows=24414,outputHeapSize=23.252 MB,partitions=1)                                   
                              ->  BroadcastJoin(n=5,totalCost=3316.232,outputRows=360,outputHeapSize=49.453 KB,partitions=1,preds=[(N1.REGIONKEY[8:5] = R1.REGIONKEY[8:1])])
                                ->  TableScan[NATION(1504)](n=4,totalCost=4.054,outputRows=20,outputHeapSize=49.453 KB,partitions=1)                                        
                                ->  NestedLoopJoin(n=3,totalCost=3307.81,outputRows=360,outputHeapSize=47.812 KB,partitions=1)                                              
                                  ->  TableScan[SUPPLIER(1536)](n=2,totalCost=4.054,outputRows=20,outputHeapSize=47.812 KB,partitions=1)                                    
                                  ->  TableScan[REGION(1488)](n=1,totalCost=4.054,outputRows=18,outputHeapSize=1.969 KB,partitions=1,preds=[(R1.NAME[0:2] = EUROPE)])     
    
    25 rows selected
    

    — Query PLAN after running ANALYZE SCHEMA SPLICE

    Cursor(n=25,rows=158224,updateMode=,engine=Spark)                                                                                                                       
      ->  ScrollInsensitive(n=24,totalCost=4065792.435,outputRows=158224,outputHeapSize=20.584 MB,partitions=16)                                                            
        ->  OrderBy(n=23,totalCost=2735090.724,outputRows=158224,outputHeapSize=20.584 MB,partitions=16)                                                                    
          ->  ProjectRestrict(n=22,totalCost=2495926.463,outputRows=158224,outputHeapSize=20.584 MB,partitions=16)                                                          
            ->  MergeSortJoin(n=21,totalCost=2495926.463,outputRows=158224,outputHeapSize=20.584 MB,partitions=16,preds=[(P.PARTKEY[36:4] = AggFlatSub-0-1.PARTKEY[36:21]),(PS.SUPPLYCOST[36:3] = AggFlatSub-0-1.SQLCol1[36:20])])
              ->  ProjectRestrict(n=20,totalCost=2099.321,outputRows=195313,outputHeapSize=768.814 KB,partitions=1)                                                         
                ->  GroupBy(n=19,totalCost=2099.321,outputRows=195313,outputHeapSize=768.814 KB,partitions=1)                                                               
                  ->  ProjectRestrict(n=18,totalCost=2099.321,outputRows=195313,outputHeapSize=768.814 KB,partitions=1)                                                     
                    ->  BroadcastJoin(n=17,totalCost=2099.321,outputRows=195313,outputHeapSize=768.814 KB,partitions=1,preds=[(N1.REGIONKEY[30:6] = R1.REGIONKEY[30:7])])   
                      ->  TableScan[REGION(1488)](n=16,totalCost=4.009,outputRows=1,outputHeapSize=768.814 KB,partitions=1,preds=[(R1.NAME[28:2] = EUROPE)])                
                      ->  BroadcastJoin(n=15,totalCost=2091.303,outputRows=976563,outputHeapSize=3.754 MB,partitions=1,preds=[(S1.NATIONKEY[26:4] = N1.NATIONKEY[26:5])])   
                        ->  TableScan[NATION(1504)](n=14,totalCost=4.048,outputRows=25,outputHeapSize=3.754 MB,partitions=1)                                                
                        ->  BroadcastJoin(n=13,totalCost=2083,outputRows=976563,outputHeapSize=3.754 MB,partitions=1,preds=[(S1.SUPPKEY[22:3] = PS1.SUPPKEY[22:2])])        
                          ->  TableScan[SUPPLIER(1536)](n=12,totalCost=28,outputRows=10000,outputHeapSize=3.754 MB,partitions=1)                                            
                          ->  TableScan[PARTSUPP(1568)](n=11,totalCost=1948,outputRows=800000,outputHeapSize=3.052 MB,partitions=1)                                         
              ->  BroadcastJoin(n=10,totalCost=2714.779,outputRows=2441,outputHeapSize=313.369 KB,partitions=1,preds=[(N.REGIONKEY[16:17] = R.REGIONKEY[16:18])])           
                ->  TableScan[REGION(1488)](n=9,totalCost=4.009,outputRows=1,outputHeapSize=313.369 KB,partitions=1,preds=[(R.NAME[14:2] = EUROPE)])                        
                ->  BroadcastJoin(n=8,totalCost=2706.76,outputRows=12207,outputHeapSize=1.53 MB,partitions=1,preds=[(S.NATIONKEY[12:11] = N.NATIONKEY[12:15])])             
                  ->  TableScan[NATION(1504)](n=7,totalCost=4.048,outputRows=25,outputHeapSize=1.53 MB,partitions=1)                                                        
                  ->  BroadcastJoin(n=6,totalCost=2698.44,outputRows=12207,outputHeapSize=1.53 MB,partitions=1,preds=[(S.SUPPKEY[8:8] = PS.SUPPKEY[8:2])])                  
                    ->  TableScan[SUPPLIER(1536)](n=5,totalCost=28,outputRows=10000,outputHeapSize=1.53 MB,partitions=1)                                                    
                    ->  MergeJoin(n=4,totalCost=2426.44,outputRows=10000,outputHeapSize=163.355 KB,partitions=1,preds=[(P.PARTKEY[4:4] = PS.PARTKEY[4:1])])                 
                      ->  ProjectRestrict(n=3,totalCost=442.123,outputRows=2359,outputHeapSize=163.355 KB,partitions=1,preds=[like(P.TYPE[2:3], %BRASS)])                   
                        ->  TableScan[PART(1472)](n=2,totalCost=442,outputRows=3336,outputHeapSize=163.355 KB,partitions=1,preds=[(P.SIZE[2:4] = 15)])                      
                      ->  TableScan[PARTSUPP(1568)](n=1,totalCost=1948,outputRows=800000,outputHeapSize=6.104 MB,partitions=1)                                              
    
    25 rows selected

    Exception while running query:
    ERROR 58009: Network protocol exception: DSS length not 0 at end of same id chain parse. The connection has been terminated.
    ERROR 08006: DERBY SQL error: SQLCODE: -1, SQLSTATE: 08006, SQLERRMC: A communications error has been detected: {0}.::SQLSTATE: XJ001Java exception: ‘: java.lang.NullPointerException’

    Link to exception in one of stages in spark job – stack trace:
    http://shorttext.com/848bba20

    Besides this, just for your information I had raised an issue @ http://community.splicemachine.com/community/forum/topic/error-starting-splice-machine-in-region-server/ and have resolved it, but unable to reply to the post. I wanted to share the solution for that issue. Below is solution to it:
    •Exception in brief:
    Splice machine services checks for master initialization while being initialized with limited (configurable) retry attempts, but fails once retry count finishes/exhausts.
    •Solution
    Increase retry count in hbse configuration from Ambari console to 30
    hbase.client.retries.number = 30

    • This reply was modified 2 weeks, 6 days ago by  Splice Admin.
    #766265

    jyuan
    Moderator

    Hi Rakeshj,

    Cann’t access the Spark log now. Can you share with us again?

    #766341

    dhanyavv
    Participant

    hi

    Whether this issue is resolved, even i am getting the same Derby error while trying to do count(*) from a table with 128 million records. Can you kindly help please.

    #766342

    jyuan
    Moderator

    Dhanyavv,

    Can you file a Jira to https://splice.atlassian.net, so that we can keep track of it? Please describe the environment and steps to reproduce the issue, and upload log files that contains the exception? We will look into it as soon as possible. Thanks!

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic.